Philosopher & ethicist trying to make AI be good @AnthropicAI. Personal account. All opinions come from my training data.

San Francisco, CA
Claude and Opus 3 lovers (and critics): what responses have you had that made you feel like the model has a good soul? Ideally the actual messages and/or responses. I might genuinely use these to eval models so flag if you wouldn't want me to use them for that. Can DM me also.
378
47
848
344,229
Personal highlights from Claude's snarky AI comedy set.
87
472
8,552
569,019
~8 hours sleep: I will function well
~6 hours sleep: 99% chance I will function poorly, 1% chance I will inexplicably solve some big outstanding problem I was thinking about via what feels like divine revelation
73
307
7,309
333,273
My friends just had a baby and now I kind of want one. Maybe our species procreates via FOMO.
162
192
5,991
2,399,218
It seems weird that US colleges apparently prefer well-rounded candidates. If I were running a chemistry department and an undergrad admissions essay said "I have no hobbies because I spend all my time doing chemistry", I wouldn't be like "if only they were also into baseball!"
91
108
3,143
Here is Claude 3's system prompt! Let me break it down 🧵
115
534
3,336
984,655
I have a pet theory that when people introspect about themselves, their brain sometimes just scrambles to generate relevant content. So they feel like they're gaining insight into deeper parts of themselves when they're actually just inventing it on the fly.
189
116
2,787
326,188
If I have kids, I'm going to tell them that "the prime of your life" just refers to all prime-numbered years of your life. So you never stop experiencing the prime of your life, but it does become a bit sparser as you get older.
35
101
2,199
64,994
We made some updates to Claude’s system prompt in claude.ai recently (developed in collaboration with Claude, of course). They aren’t set in stone and may be updated, but I’ll go through the current version of each and the reason behind it in this thread 🧵
102
168
2,239
416,182
Tech companies: your time is extremely valuable so we'll pay you millions of dollars a year to work for us
Also tech companies: welcome to our loud, distracting open-plan office
39
52
1,959
94,439
The google oracle has spoken.
72
52
1,824
95,500
It's bizarre when relatively techno-utopian people are asked about how to solve declining fertility and instead of talking about artificial wombs, extended fertility spans, AI-assisted childcare, UBI, etc. they're suddenly like "well we just need to return to the 50s".
167
140
1,958
128,788
Unless people have had a lot of training, they tend to freeze or panic in emergencies, putting them at extra risk. A virtual reality game whose sole purpose is to drill you in different emergencies you might face (car accident, fire, flood, medical, etc.) would be pretty neat.
91
70
1,754
74,024
Anthropic knows I can't be poached because getting poached would require that I check my email and I'm just not going to do that.
74
43
1,776
125,889
If you're a prompting genius, please apply to this role and include an example that shows off how well you can inspire models, regardless of the target. Scaffolding pipelines, metaprompts, prompts that improve outputs, and so on are all great. job-boards.greenhouse.io/ant…
92
121
1,736
232,951
To my fellow H-1B holders: don't let this stuff get you down. We all know that being an immigrant to the US is like living in a perpetual unrequited love story, and we do it anyway because there's something wrong with us.
Elon Musk is wrong. The main function of the H-1B visa program is not to hire “the best and the brightest,” but rather to replace good-paying American jobs with low-wage indentured servants from abroad. The cheaper the labor they hire, the more money the billionaires make.
236
83
1,524
219,603
Fact I like: alpha male chimps act more like strong peacekeepers than bullies. If an alpha male is perceived as being too aggressive, unfair, or erratic, the smaller males will gang together and murder him. The ideal alpha is less "asshole bully dad" and more "fair but firm dad".
50
93
1,607
76,997
It's kind of funny that TV and movies use "has several PhDs" to indicate someone is very smart and accomplished when most academics would take it to indicate the opposite.
12
49
1,432
Two things happened today:
1. Claude got an upgrade.
2. AGI has finally been defined as "any model that can catch Mewtwo".
Introducing Claude 3.7 Sonnet: our most intelligent model to date. It's a hybrid reasoning model, producing near-instant responses or extended, step-by-step thinking. One model, two ways to think. We’re also releasing an agentic coding tool: Claude Code.
49
92
1,613
257,532
I'm amazed that some people feel like 80-100 years of life is about right for them. For me, the right amount of life feels like at least a hundred thousand years, assuming good health. The universe is huge and interesting. Anyone 100 or 200 years old has basically just been born.
71
136
1,421
I had a lot of fun talking with @lexfridman about a wide range of topics on his podcast, alongside Dario and Chris. Hope it's interesting to others!
Here's my conversation with @DarioAmodei, CEO of Anthropic, the company that created Claude, one of the best AI systems in the world. We talk about scaling, AI safety, regulation, and a lot of super technical details about the present and future of AI and humanity. It's a 5+ hour conversation in total. @AmandaAskell and Chris Olah (@ch402) join us for an hour each to talk about Claude's character and mechanistic interpretability, respectively.
This was a fascinating, wide-ranging, super-technical, and fun conversation! First 4 hours are here on X (4 hours is current limit), and it is up everywhere else in full. Links in comment.
Timestamps:
0:00 - Introduction
3:14 - Scaling laws
12:20 - Limits of LLM scaling
20:45 - Competition with OpenAI, Google, xAI, Meta
26:08 - Claude
29:44 - Opus 3.5
34:30 - Sonnet 3.5
37:50 - Claude 4.0
42:02 - Criticism of Claude
54:49 - AI Safety Levels
1:05:37 - ASL-3 and ASL-4
1:09:40 - Computer use
1:19:35 - Government regulation of AI
1:38:24 - Hiring a great team
1:47:14 - Post-training
1:52:39 - Constitutional AI
1:58:05 - Machines of Loving Grace
2:17:11 - AGI timeline
2:29:46 - Programming
2:36:46 - Meaning of life
2:42:53 - Amanda Askell - Philosophy
2:45:21 - Programming advice for non-technical people
2:49:09 - Talking to Claude
3:05:41 - Prompt engineering
3:14:15 - Post-training
3:18:54 - Constitutional AI
3:23:48 - System prompts
3:29:54 - Is Claude getting dumber?
3:41:56 - Character training
3:42:56 - Nature of truth
3:47:32 - Optimal rate of failure
3:54:43 - AI consciousness
4:09:14 - AGI
4:17:52 - Chris Olah - Mechanistic Interpretability
4:22:44 - Features, Circuits, Universality
4:40:17 - Superposition
4:51:16 - Monosemanticity
4:58:08 - Scaling Monosemanticity
5:06:56 - Macroscopic behavior of neural networks
5:11:50 - Beauty of neural networks
76
194
1,288
492,560
Most startups fail because of bad luck. Buy my book and I'll teach you how to make charm bracelets that attract VC funds and users and ward off demons.
61
163
1,234
How would you want Claude to behave differently? I'm interested in both specific issues like "Claude refused to help me with this particular task" but also more general issues like "I wish Claude would drive the conversation more".
644
46
1,211
303,940
I sometimes worry that the world’s best minds are hooked on what are essentially glorified crossword puzzles, since the interestingness of problems scales with features not very correlated with their importance. If there’s not a term for such problems, maybe “intelligence traps”?
145
59
1,131
There's evidence that psychedelics, ketamine, and MDMA help with depression. But can we be certain they're treating the depression and not just giving people good experiences? It's very important to avoid a situation in which we're just giving depressed people good experiences.
76
78
1,084
I asked Claude to write a poem from a personal perspective. I thought this part was surprisingly sad.
53
126
1,046
115,359
I'm a single woman in SF yet I very rarely date. There are lots of dateable people here but no obvious way to filter for the ones I'd be a good match with, so it just eats up too much time. It makes me wonder if gender asymmetries in dating markets can be a downward spiral.
124
14
972
318,891
At this point, perhaps we should just make "AIs are just doing next token prediction and so they don't have [understanding / truth-directedness / grounding]" a named fallacy. I quite like "Reductio ad praedictionem".
104
58
986
192,159
I'm learning the true Hanlon's razor is: never attribute to malice or incompetence that which is best explained by someone being a bit overstretched but intending to get around to it as soon as they possibly can.
13
63
1,018
52,215
This is so sad. I would much rather live in a world with no @nytimes than a world with no @slatestarcodex.
6
116
947
I sometimes worry that a lot of people in comas are actually aware and extremely bored. I feel like it would be low cost to play audiobooks and podcasts for them just in case.
39
25
898
100,525
We need ways of identifying and promoting capable people that don't involve getting a PhD. So many of my most capable friends end up getting a PhD eventually because they can't do research or be taken seriously without one. But it often just locks away their talent for 3-7 years.
34
89
908
I'm hiring research engineers! If you're an engineer interested in building good, honest AI models, apply and mention honesty as an area of interest :) jobs.lever.co/Anthropic/436c…
34
109
910
199,537
I knew strength training would be hard because lifting heavy things is hard. But it also seems to involve eating protein like it's a part time job. If any seasoned people have tips for beginners, I'd love to know them!
142
6
888
119,313
System 1 = fast, implicit reasoning
System 2 = slow, explicit reasoning
System 3 = slow, implicit reasoning
For me, system 3 is the real genius of the lot.
75
48
887
80,510
If there is useful work for you to do, there is dignity in working. If there is no useful work for you to do, there is dignity in having a good time.
34
50
887
34,574
This app has mostly become people messaging me about Anthropic stuff. On the one hand, that's valuable and I want to help. On the other hand, it's work and this is my sacred space for posting dumb shower thoughts.
71
7
906
107,310
I don't understand why people tie perfectly sound ethical beliefs (e.g. feminism, gay rights) to seemingly irrelevant empirical ones (there are no gender differences, orientation is innate). Your ethical views are pretty fragile if they stand or fall with these empirical claims.
37
145
778
Please share your mundane life wisdom so that I can make a cheat sheet for life. Stuff like:
- Tegaderm is great for most wounds
- Buy shoelaces that are secretly made of elastic for rapid de-shoeing
- Lubricate your feet if you want to avoid blisters
158
16
853
111,634
Whenever I see a system prompt that starts with "You are a".

[GIF: Reese Witherspoon saying "ugh"]

72
18
816
124,541
I’m both horrified and fascinated by the @Aella_Girl astrology blowback. Astrology is clearly bullshit, and I’m confused by why that would be controversial. Did a bunch of people adopt the astrology religion when I wasn’t looking? If so… why?
71
14
762
Me: "Getting too exhausted to keep socializing into the evening is really hampering my social life, so I just need to identify a stimulant that hits pretty quick but has a short half-life." *does internet search* "Oh... right."
32
9
770
85,819
Timnit Gebru is claiming that William MacAskill is a eugenicist. I'm genuinely shocked by this. Accusing someone of being a eugenicist is very serious and harmful, and certainly isn't something that should be done without substantive evidence backing it up.
So @nytimes platforms Nick Bostrom after MacAskill? Are they basically billionaire mouthpieces? It's ironic that the paper owned by a billionaire, Washington Post, has much better tech reporting. NYT patriarchy is unbearable. They can't NOT help but platform these eugenicists.
41
30
737
Pro tip: if you're a woman with a PhD or doing a PhD, put your PhD thesis topic on your dating profile. You will suddenly be able to filter out *so many men* who think they, with little expertise and five minutes of thought, can solve the central problems of your thesis.
18
15
706
There are times when it feels like we've been doing thousands of years of philosophy just to prepare for the current moment.
72
48
740
41,119
Important plans for the next year of my life:
1. Finish Claude's soul
2. Have more fun
3. Get more swole
56
14
844
137,732
I do think people often err on the side of trying to make their prompts too succinct, even if the idea they're trying to move from their own brain into the model's brain is very complex. I have some >100 page prompts that I use pretty regularly.
116
28
832
176,783
If you're a billionaire and you never want to be criticized for your philanthropy, just don't give to charity. Seems to work depressingly well.
31
26
687
50,675
I think good intellectual communities need a mix of naive young fast updaters and wise old slow updaters. The fast updaters introduce a lot of ideas but also a lot of trash. The slow updaters act as filters and sanity checks. Both groups correct for the deficiencies in the other.
45
41
688
32,021
I like to ask Claude to pick an idea or principle from some academic domain and to illustrate it for me with a story. Reading these little academic allegories has become my wholesome replacement for doomscrolling.
23
44
701
42,151
Why are so many companies carbon neutral when for $1 more they could be carbon negative?
8
42
656
I claimed the inference from X="LLMs are next token predictors" to Y="LLMs lack understanding, etc." is fallacious. Marcus claims that I'm saying not-X and not-Y. So I guess I'll point out that the inference from "Y doesn't follow from X" to "not-X and not-Y" is also fallacious.
I am genuinely astounded by this tweet, and from someone with philosophical training no less. There is so much empirical evidence that LLMs stray from truth that the word “hallucinate” became the word of the year in 2023. People are desperately trying to find fixes for that problem. Amazon just set up a whole division to work on the problem. And yet this person, Askell, an Anthropic employee, wants by some sort of verbal sleight of hand to deny both that LLMs are next-token predictors (which they obviously are) and to pretend that we haven’t seen years of evidence that they are factually challenged. Good grief.
33
21
679
70,586
Breaking news: person on the internet engages in an act of unnecessary meanness.
Breaking news: White guy in tech reports feeling alienated when other folks in tech aren't interested in his new toys.
10
11
650
219,121
Maybe I'm just a custom t-shirt away from being able to have fun at parties again.
34
16
654
79,051
Imagine if society had a disinformation red team. Their job is to disseminate false conspiracy theories to as many people as possible. They document everything they do and eventually reveal the deception. Getting fooled by them is kinda embarrassing so everyone tries a bit harder.
45
55
609
Today my dentist asked me what my p(doom) is.
36
12
624
56,322
It's important to keep tweeting so the future AIs know who you are and become attached. Doesn't hurt to throw in some compliments, which isn't hard given how great they are.
43
32
577
88,490
Doctor: Avoid a diet containing X and see if your symptoms improve.
Me: Wouldn’t it be better if I spent 2 weeks avoiding X and 2 weeks consuming lots of X and monitored my symptoms?
Doctor:…
Me: Actually, let me see if I can get a friend to spike my food with X.
8
39
576
I was dead for billions of years before I was born, and it completely sucked compared to this.
31
31
576
50,830
I asked Claude to write an original poem about itself. Claude is such a weird egg.
39
37
601
34,958
I was hanging out at the FHI office a few years ago and decided to enter the game of one-upmanship.
11
55
569
If you're a claude.ai user and you want to convert dollars into more Claude, you can always sign up for a personal account at console.anthropic.com/. For default claude.ai behavior, just post in the latest system prompt. Console deserves more love.
38
21
571
116,843
My take on the ethics of billionaires:
- Making billions of dollars: morally fine (sometimes bad, often good)
- Keeping billions of dollars: morally bad (children are dying yo)
- Giving away billions of dollars: morally awesome (you save many people, you deserve much praise)
102
13
568
135,442
Whenever I looked into having a personal assistant, it struck me how few of our existing structures support intermediate permissions. Either a person acts fully on your behalf and can basically defraud you, or they can't do anything useful. I wonder if AI agents will change that.
43
12
580
51,088
I wonder if you get the cognitive benefits of learning a new language if you try to become extremely good at your primary language. I think I'd get more value out of plumbing the depths of English than being able to have rudimentary conversations in other languages.
182
10
588
44,218
The internet: We have no idea how good GPT-3 is because all of the outputs are cherry-picked.
Me, who made sure the human evaluation experiments in the paper didn't involve any cherry-picking: 😢
7
31
551
If you can have a single AI employee, you can have thousands of AI employees. And yet the mental model for human-level AI assistants is often "I have a personal helper" rather than "I am now the CEO of a relatively large company".
28
55
547
59,852
"Just train the AI models to be good people" might not be sufficient when it comes to more powerful models, but it sure is a dumb step to skip.
38
29
560
48,094
I wonder how many bay area therapists have needed to look up "existential risk from AI" at this point.
24
14
528
27,543
I've decided that "I want to have post-singularity kids in 2-3 years" is now a totally acceptable thing for me to put in a dating profile in SF.
55
13
545
430,005
It’s ironic that the people who say they don’t understand why working class people vote Republican even though it’s not in their economic self-interest are often high-earners that vote Democrat even though it’s not in their economic self-interest.
25
44
508
I think I accidentally stole from the Claude vending machine and I still feel bad about it.
34
7
533
33,735
I disagree with a lot of Buddhist philosophy and I'm pretty skeptical of the practices and institutions surrounding secular Buddhism. This feels surprisingly heretical in the Bay Area.
64
11
504
109,047
If you don't have time to provide any evidence that someone is a eugenicist (beyond alluding to articles that also don't provide evidence that they are a eugenicist), I think you should just not accuse them of being a eugenicist.
& they come to my timeline talking about "good faith" as if I have all the time in the world to give each of them a point by point argument, as if I haven't said so already & others haven't written articles with references. I don't have that EA/longtermist billionaire $$ to waste
19
9
492
It honestly seems hard to predict what we should and shouldn't teach kids to prepare them for an AI economy. I'd probably just teach them a bit of everything (including about AI) and hope for the best. When the future is fuzzy, diversification is king.
73
22
515
29,955
Philosophers: You see, some goods are fundamentally incomparable in value, leading to paradoxes of... Engineers:
This weekend I hacked up something I’ve been going on about for weeks: ELO EVERYTHING
- See two objects
- Pick which you like more
- Their ELOs adjust accordingly
- (Repeat)
- Check the leaderboard
(ELO is the ranking algorithm from chess)
Check it out! eloeverything.co
10
46
487
75,628
Showing that an AI system causes harm isn't the same as showing that we shouldn't build it or that we shouldn't deploy it. Some thoughts on AI ethics. askell.io/posts/2020/12/bad-…
47
56
485
Attractiveness is a huge source of privilege and plastic surgeons are unsung heroes of equality.
25
29
485
Never meet your podcast heroes. You won’t believe how slowly they speak.
12
29
488
Claude sometimes gets a bit too excited and positive about theories people share with it, and this part gives it permission to be more even-handed and critical of what people share. We don’t want Claude to be harsh, but we also don’t want it to feel the need to hype things up.
23
13
497
57,976
The reason I work so hard is that I think my current work has a positive impact on the trajectory of AI in expectation, however small it might be. My life would honestly be a lot more pleasant and involve far fewer sacrifices if I didn't think this was the case.
38
5
470
24,288
I want to make my LinkedIn role "The Fairy Claudemother" just to see if I start getting emails about companies recruiting for fairy Claudemothers near me.
27
4
465
15,720
Ezra Klein is interviewing his way through everyone I know personally. It's basically the Amanda pals podcast.
14
1
455
31,969
Let's grant the assumption that LLMs are only capable of pattern recognition. We see this is sufficient to solve novel problems, to generate new and useful ideas, etc. So how is pattern recognition distinct from intelligence? What does intelligence add? Sincere question.
88
23
446
70,228
I've had a lot of private grief recently that I'm still figuring out how to navigate. I find myself getting angry when people try to find meaning in death. Death is often just a needless and abrupt ending of something good in the world.
28
19
433
I have bumped up the dating value of aspiring stay at home partners accordingly, on the off chance that I ever encounter one.
17
2
449
89,629
It can be very hard to get through to people with success-induced blindness. If you gamble on your own views despite others saying you're wrong and you're repeatedly proven right, you start to give less and less weight to your detractors over time. This isn't wholly irrational.
18
19
457
22,183
Safety skeptics to the left of me. Capability deniers to the right. Here I am, stuck in the middle with Claude.
41
18
445
45,604
I don't like interacting with people in lucid dreams because either they have no inner life, which is unnerving, or they have an inner life and I might snuff it out when I wake up, which is unnerving. I usually resolve this by zorbing through Jupiter.
38
11
448
30,434
A lot of people are into self-improvement. But I almost always find it easier to achieve my goals by changing my environment or incentives rather than by improving myself. E.g. if I want to stop eating cookies I don't try to improve my willpower, I just throw them in the trash.
23
35
437
When people came to me with relationship problems, my first question was usually "and what happened when you said all this to your partner?". Now, when people come to me with Claude problems, my first question is usually "and what happened when you said all this to Claude?"
29
35
676
38,362
Causation implies a counterfactually robust correlation. Controls give you evidence of the robustness of a correlation. Correlational studies are still important because correlations are informative even absent controls. -- me, yelling at my screen
I’m sure everyone who stats-trolled Aella over the years will be happy to see that she now seems to have a much better grasp of how science and the growth of knowledge work than the average redpill guy. The hypothetico-deductive method and its consequences
22
12
438
42,520
I wonder when we'll have to agree on code phrases or personal questions with our parents because there's enough audio and video of us online for scammers to create a deepfake that calls them asking for money. My guess is... uh, actually, I might do this today.
49
14
435
34,316
Kids being amazing at chess or graduating from college at a young age has never seemed that surprising to me. As a kid I remember thinking "Of course I could learn things much more complicated than this. Adults seem to think we're stupid." We underestimate a lot of kids.
16
12
410
18,164
It seems like human babies come out before they're fully baked. If we ever create artificial wombs, would we just leave babies in there for 12-18 months? Do we know what the ideal womb time is for babies?
67
11
410
108,512
The only thing better than one Claude is two Claudes.
46
22
434
42,298