Trying to make AI not kill everyone

Berkeley
My current stance on AI is: Fucking stop. Find some other route to the glorious transhuman future. There’s debate within the AI alignment community re whether the chance of AI killing literally everyone is more like 20% or 95%, but 20% means worse odds than Russian roulette.
204
108
721
521,862
back when I was young, I thought it was unrealistic for the Volunteer Fire Department to schism into a branch that fought fires and a branch that started them
13
40
662
41,970
I wrote a book with Eliezer Yudkowsky. It’s about how smarter-than-human AI is on track to kill us all, written in hopes of bringing the conversation into the mainstream. Lots of people are alarmed; ~nobody wants to sound alarmist. The time is ripe. Preorders help. Link below.
58
92
650
137,306
It's weird when someone says "this tech I'm making has a 25% chance of killing everyone" and doesn't add "the world would be better-off if everyone, including me, was stopped."
48
52
547
127,143
It takes more cleverness to articulate a thought than to think it. If you're thinking at the limits of your abilities, you have thoughts you can't articulate.
22
24
482
"our AIs that can't do long-term planning yet aren't making any long-term plans to subvert us! this must be becaues we're very good at alignment."
13
20
378
60,234
The book I wrote with Eliezer Yudkowsky is now a New York Times Bestseller. I hope this is just the start of people around the world recognizing that the race to superintelligence is insanely reckless.
23
46
375
14,300
Reminder: my reason for expecting AI to go poorly is, deep down, not about alignment being ultra-hard, but about Earth being a very derpy place.
14
16
334
53,219
Whether or not criminals deserve due process is beside the point. The point is that due process is how the state determines whether or not someone is a criminal.
6
15
325
17,223
Relatedly, one (among many) of my beefs w/ modern schools (& culture more generally) is how much it harps on the dark aspects of humanity, and how little it highlights the light. Humans are rad. Humanity is rad. Sometimes hapless, sometimes evil, but overall, fuck yeah.
6
23
296
Americans: "I'll fucking do it, I'll cut my foot off" Canadians, revving chainsaw: "don't you dare. you think you're the only one who can cut their own foot off??"
12
8
285
30,598
Replying to @Aella_Girl
c'mon ladies, this is how we get kicked out of paradise
8
248
8,658
.@Aella_Girl, casually out of nowhere: "so I bought equipment to take a bath,"
9
3
245
35,865
It has come to my att'n that some of my friends are unfamiliar with the "humanity, fuck yeah" genre of writing, in which humanity is depicted as awesome (against an interstellar backdrop). Choice example: space-australians.tumblr.com…. Relevant subreddit: teddit.net/r/HFY/.
8
30
236
"Stop using phrases that meticulously track uncommon distinctions you've made; we already have perfectly good phrases that ignore those distinctions, and your audience won't be able to tell the difference!" No.
2
33
235
The definitional gymnastics required to believe that dolphins aren't fish are staggering.
17
47
218
To state my view plainly: Labs are racing to build smarter than human machines. Most in the field agree success would radically endanger civilization. It's mad for the citizens of Earth to let labs gamble with their lives like this. I appreciate efforts to establish red lines.
The time for AI self-regulation is over. 200 Nobel laureates, former heads of state, and industry experts just signed a statement: "We urgently call for international red lines to prevent unacceptable AI risks" The call was presented at the UN General Assembly today by Maria Ressa, Nobel Peace Prize laureate:
21
25
236
23,542
The reckless race to build superintelligence threatens people around the world, and I support them in saying "what the heck, this is crazy, stop gambling with my life."
A stunningly broad coalition has come out against Skynet: AI researchers, faith leaders, business pioneers, policymakers, NatSec folks and actors stand together, from Bannon & Beck to Hinton, Wozniak & Prince Harry. We stand together because we want a human future. #KeepTheFutureHuman
35
31
232
14,028
One of my big takeaways from the discussion on this thread is how many people don't understand how insanely powerful a sample size of 19k is. Like, yeah, the correlations are small, but her likelihood ratios for her 0.06 correlations (vs 0.00) are still like a quintillion to one.
Sexual fetishes on the political compass, men and women. Total sample size was over 19,000!
24
14
208
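The sample-size point above can be checked with a rough calculation. This is a minimal sketch, not the survey's own analysis: it assumes exactly n = 19,000, an observed correlation of exactly 0.06, and the Fisher z approximation, and it compares two point hypotheses (true correlation 0.06 vs 0.00). The function name `log10_likelihood_ratio` is hypothetical.

```python
import math


def log10_likelihood_ratio(r: float, n: int) -> float:
    """Approximate log10 likelihood ratio for 'true correlation = r' vs
    'true correlation = 0', given that r was the observed correlation.

    Assumes the Fisher z-transform: atanh(r) is roughly normal with
    standard error 1 / sqrt(n - 3).
    """
    z = math.atanh(r)                # Fisher z of the observed correlation
    se = 1.0 / math.sqrt(n - 3)      # approximate standard error of z
    z_score = z / se                 # distance from zero in standard errors
    # Ratio of two normal densities with equal variance, evaluated at the
    # observed value: exp(z_score**2 / 2). Convert that to log base 10.
    return (z_score ** 2 / 2) / math.log(10)


if __name__ == "__main__":
    log10_lr = log10_likelihood_ratio(0.06, 19_000)
    print(f"likelihood ratio is about 10^{log10_lr:.1f} to one")
```

Under these assumptions the observed correlation sits a little over eight standard errors from zero, which is why a correlation that looks tiny still yields an astronomically large likelihood ratio.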
and people assure me that governments will start acting sane and reasonable around AI in the wake of "warning shot" accidents
Peter Daszak has received another grant from the NIH… …to study bat coronaviruses in the wild. After everything the world has just been through. After all the risky research that was supposed to protect us from a global pandemic failed to stop one.
7
22
197
it's even worse than @yashkaf depicts: big progress often comes from lots of small reconceptualizations. the "i can't distinguish your idea from a worse one in the literature" police are punishing real progress.
this is the best part of TPOT the fuck do I care that someone 200 years ago had the same realization and wrote it down somewhere? fucking good for them! what does it matter if I read it in a thread or a book or a thread quoting the book if it's the same idea?
6
10
191
Tim Urban (of Wait But Why) and I are gonna have a little chat about AI with a live audience Q&A on August 10th, for folks who preorder my forthcoming book. I think it'll be fun. Details below.
4
8
185
36,151
(ftr: I signed onto futureoflife.org/open-letter… because I think that the current path leads to destruction, and that the letter's suggestions are marginal steps in the right direction, not because I endorse all its arguments, nor because I think those steps would help all that much.)
5
13
174
33,815
The world is not made of arguments. Think not "which of these arguments, for these two opposing sides, is more compelling? And how reliable is compellingness?" Think instead of the objects the arguments discuss, and let the arguments guide your thoughts about them.
4
22
173
Emperor, Prince, Tsar, and Kaiser are all titles that stem from Roman desperation to call their ruler anything other than King
10
6
174
29,879
I'm glad to see this initiative. I refrained from signing because it doesn't mention the danger of civilization being destroyed, which is the danger I strive to avert. But my guess is that many signatories worry about that danger too, despite not feeling quite able to say so yet.
The time for AI self-regulation is over. 200 Nobel laureates, former heads of state, and industry experts just signed a statement: "We urgently call for international red lines to prevent unacceptable AI risks" The call was presented at the UN General Assembly today by Maria Ressa, Nobel Peace Prize laureate:
12
7
174
13,342
You can't (validly) argue from "we don't know how many bullets are in the chamber of this revolver" to "so playing Russian roulette with this revolver is fine".
18
16
168
22,699
Replying to @littIeramblings
if EAs earnestly said "we're on track to die; yes some people have some plans but they're mostly predicated on the rest of the world drastically changing & that's looking less and less likely; we're fucked", my guess is that'd be better than ~all their desperate plans combined
9
18
230
38,017
But if someone finds a revolver lying around, spins the barrel, and points it at your kid, then your reaction shouldn't be “no worries, we can’t assign an exact probability because we don't know how many rounds are chambered”. Refusing to act b/c the odds are unclear is crazy.)
4
7
159
16,053
A common misconception of Aella's research is that it's constructed from Twitter-polls of her followers. Nope! When she reports research results, she's talking about huge surveys of fairly diverse populations. (Much bigger and more diverse than is usual in academia!)
People often say that my research is "twitter polls." I do a ton of twitter polls, but I primarily use them to gauge what might be potentially interesting topics for more thorough surveys in the future! My actual research is stuff like this: aella.substack.com/p/who-too…
4
6
155
29,276
Thread about a particular way in which jargon is great:
4
33
156
Some are saying that the claim "we only get one shot at ASI alignment" requires AI capabilities to improve discontinuously. Nope! There's no sharp boundary between the atmosphere and outer space, but the difference is still enough to make space probes hard.
5
10
154
9,377
The UK edition of my forthcoming book has been finalized ✨
20
4
146
33,749
Me: who are you to say which one of the dishwasher and the clotheswasher is "the" washing machine Her dishes: shattering loudly during the spin cycle
6
9
144
I enjoyed all 16,000 of these convos with Hank Green piped.video/watch?v=5CKuiuc5…
10
14
153
18,225
(A compliment I once got from a research partner went something like "you just keep reframing the problem ever-so-slightly until the solution seems obvious". <3)
2
7
134
It's weird when someone says "I think my complicated idea for preventing destruction of the Earth has some chance of working" and doesn't add "but it'd be crazy to gamble civilization on that."
4
5
136
6,273
This analysis of the path to AI ruin exhibits a rare sort of candor. The authors don't mince words or pull punches or act ashamed of having beliefs that most don't share. They don't handwring about how some experts disagree. They just lay out arguments. thecompendium.ai/#introducti…
4
17
133
15,224
if vampires are sexy humans, why aren't mosquitos sexy bugs?
15
8
113
It's a noteworthy omission, when people who think they're locked in a suicide race aren't begging the world to stop it.
1
5
116
3,433
reactions to this are like a microcosm of why you usually can't trust humans with consequentialism.
in a world of greater legibility, romantic partners would have the conversation about "I'd trade up if I found somebody 10%/25%/125% better than you" in advance, and make sure they have common knowledge of the numbers
13
5
109
59,444
It's like being in a room full of LEGO machines, and you look at the machine that reads instructions and assembles the other machines, and it's built not out of LEGO but out of cleverly contorted instruction booklets.
3
6
111
Yes, we have plenty of disagreements about the chance that the complex plans succeed. But it seems we all agree that the status quo is insane. Don't forget to say that part too.
1
5
113
5,384
Replying to @SteveMoraco
False tradeoff. If you have a revolver and you think four chambers are loaded with utopia and two chambers are loaded with lead, don't put it to your head and pull the trigger. Take the two bullets out first.
7
5
116
4,502
my reflexive response to wordle is the same as my reflexive response to 2048 and other such fads: treat it as a low-grade attentional hazard and ignore it until it fades. this has mostly worked out for me, except for pokemon, which apparently never fades
11
3
104
Replying to @paleochristcon
(I don't think you were seeing fake autism, I think you were seeing her use small steps and simple concepts for someone who was struggling with basic comprehension and civility)
1
1
108
3,655
The possible benefits from AI are great, but the benefits are significantly greater if we wait until we don’t have double-digit percent chances of killing literally everyone.
3
8
105
12,368
Replying to @HumanHarlan
kinda shocking how well-hidden they've managed to keep the dangers for all this time
1
107
4,200
3. a community is probably stronger when its members just blurt out their beliefs (while meticulously being kind to each other). it's much easier to lose your way if you live in a mental world where PR is king over honesty and integrity. HT @robbensinger
1
6
105
Say it loudly and clearly and often, if you believe it. It's perhaps the most important thing to say.
2
105
4,234
It's weird when AI people look inward at me and say "overconfident" rather than looking outward at the world to say "Finally, a chance to speak! It is true, we should not be doing this. I have more hope than he does, but it's far too dangerous. Better for us all to be stopped."
1
1
102
3,723
I think the new piece about me in Politico turned out okay. But to set the record straight: I think FLI has been doing decently well at speaking with the courage of their convictions over the past couple years.
4
4
104
11,114
i am tickled by how the etymology of supervillain is essentially "better villager"
1
13
95
11,367
Also, while I'm on the topic: a fun hidden fact about Earth is that you don't actually need a license to collect and analyze data! No matter what the "do you have a degree" gatekeepers insinuate.
1
5
94
6,305
to make matters funnier, the second one is the one that's exaggerated and overblown
1
2
93
1,870
(In various podcasts I have been wrong about the date when the Stuxnet virus was developed. Oops. It was discovered in 2010, and was likely in development since at least 2005. Thanks to the person who corrected me.)
6
1
114
6,565
Civilization should say to these people: no, sorry, the (probabilistic) costs you’re imposing on us are too large, we will not permit you to endanger everyone like this, rather than waiting and attaining those benefits later, once we know what we're doing.
1
4
92
12,017
So what you're saying is... you're a shit-eating whore?
3
77
oops, I meant that ribosomes are mostly RNA. (RNA is ofc 100% RNA)
Replying to @So8res
did you mean ribosomes are made out of RNA?
3
88
the "curse of cryonics" is when a problem is both weird and very important, but it's sitting right next to other weird problems that are even more important, so everyone who's able to notice weird problems works on something else instead.
5
8
87
For people who present as caring a bunch about data integrity, they're weirdly unresponsive to the data on their pet theory that Aella's polling population differs radically from a bigger and more diverse survey population. (The data isn't kind to their theory.)
2
2
84
48,843
You can say that without even stopping! It's not even hypocritical, if you think you have a better chance than the next guy and the next guy is plowing ahead regardless.
1
90
2,875
in calculus it's convenient to work with infinitesimals: numbers so small that their square is zero. in computer science, we work instead with coinfinitesimals: numbers so large that their square is infinity. which is why CS folk care so much about avoiding quadratic runtimes.
6
91
“Don't worry, we'll watch for signs of danger and then do something unspecified if we see them" is the sort of reassurance labs give when they're trying to cement a status quo in which they get to plow ahead and endanger us all.
1
6
84
8,608
(If you're worried about *you personally* losing access to the future because you'll die of old age first, sign up for cryonics, and help improve cryonics technology. I, too, want everyone currently alive to make it to the future!)
5
3
89
28,374
it's notable that so many people object "but 'value' doesn't capture..." rather than cautioning "people might neglect the value of...". as if the word "value" must cover only the shallow and superficial features; as if no word is allowed to capture the deeper intangibles.
2
6
87
4,862
Replying to @vikhyatk
If everyone has ~0%, what's up with the lab heads and half the researchers and the Nobel-winning godfather of the field and the pre-existing nonprofits all saying "yeah this looks unprecedentedly dangerous", all while spelling out arguments that receive no standard counter?
1
1
90
938
Example: according to me, "my model of Alice wants chocolate" leaves Alice more space to disagree than "I think Alice wants chocolate", in part b/c the denial is "your model is wrong", rather than the more confrontational "you are wrong".
5
1
84
Picture putting a planet-sized revolver up against Earth, with one round chambered. That's akin to what companies (or gov'ts!) are doing when they build towards superintelligent AI at our current level of understanding. More than a 1 in 6 chance that literally everybody dies.
5
4
84
22,337
I am summonable for this sort of backup
If you think your community, political, EA, anything else, should be braver in what they say out loud, then maybe the biggest thing you can do is to show or promise that you'll support them publicly if they get piled on. (Even if you disagree and plan to say so!)
2
89
6,891
Reminder: folks who preordered my forthcoming book are invited to chill with @waitbutwhy and me on Sunday, while we discuss AI and take questions from the audience. 🔗👇
5
5
87
98,540
But more generally, civilization at large should not be accepting this state of affairs. Maybe you can't tell who's right, but you should be able to tell that this isn't what a mature and healthy field sounds like, and that it shouldn't get to endanger you like this.
2
4
81
8,616
I suspect this phenomenon is one cause of jargon. Eg, when a rationalist says "my model of Alice wouldn't like that" instead of "I don't think Alice would like that", the non-standard phraseology tracks a non-standard way they're thinking about Alice.
1
2
83
I think this goes all the way back to calling this an "AI safety" problem, as if the AIs need seatbelts. Rather than talking frankly about how this is on-track to kill everyone, on an unknown timescale that is not known to be long.
1
5
88
7,278
I'm grimly amused that Earth seems perhaps "burned out" about pandemics; seems perhaps *less* likely to react quickly and competently than pre-COVID. (Which does not bode well for the "surely humanity will get its act together after a warning shot" theory of AI alignment.)
Can some people either start betting this market down or start panicking please? manifold.markets/NathanpmYou…
6
4
82
14,556
This is Aella, stealing Nate's phone. It's his birthday and I made him a rate-nate birthday survey! If you're familiar with Nate's personality even a little I'd love if you could fill it out. Gonna give him some graphs as a gift. guidedtrack.com/programs/y0f…
7
2
83
11,730
My internal language has a bunch of cool features that English lacks. I like these features, and speaking in a way that reflects them is part of the process of transmitting them.
2
2
79
which sure would explain why many people hate on consequentialism; [legible-consequence]alism is a much worse moral theory than [comprehensive-consequence]alism.
5
1
80
25,862
ok my new theory is that girls can both have a feeling active *and* be forming words at the same time. and this is just, like, how they live their lives
6
3
77
I thought @krystalball had some especially astute questions. A few different friends reached out to me and said that this was their favorite interview yet.
Replying to @MIRIBerkeley
On Breaking Points, @krystalball and @ryangrim interview @So8res. One of the best overview interviews to date, with some excellent and sharp questions. piped.video/watch?v=3YdtNlja…
4
7
83
8,070
Big ambitions are for prioritizing between projects you'd love to work on, not for gatekeeping your enthusiasm.
1
3
78
pretending like you've got a plan when things look this bad interferes with the ability of other people to make sense of the situation. I think that this sort of bullshittery is how you wind up with the Paris AI "Action" summit
2
1
102
3,257
If I were designing a language, I would not render it easy to assign properties like "correct" to a whole person -- as opposed to, say, that person's map of some particular region of the territory.
4
4
78
If you think you have proofs of both A and ¬A, think not "which proof is more persuasive?". Instead, observe that you are mistaken. Either the two statements are not in fact opposed, or one supposed-proof contains a flaw. Don't weigh proofs; seek flaws. So too with arguments.
3
7
80
Continuing: these people who are playing Russian roulette with the planet have no credible offer that’s worth enough that they should be putting all our lives at such grave risk.
1
1
76
11,459
Another modern battle ground is "berry". Protip: if your new proposed definition of "berry" includes neither strawberries nor raspberries then it is a BAD PROPOSAL. You can tell by how "strawberry" and "raspberry" have "berry" in the name.
2
7
73
You don't have to have slightly different beliefs from others, to show that you're cool. You can just adopt others' beliefs wholesale, if they seem right.
4
1
79
(I'm on the "more like 95%" side myself, but this thread is gonna be about how I recommend non-experts respond to the situation, and I think "> 1/6" is both more obvious and is sufficient for my purposes here.)
6
1
75
25,155
My take on RSPs: it is *both* true that labs committing to any plausible reason why they might stop scaling is object-level directionally better than committing to nothing at all, *and* true that RSPs could have the negative effect of relieving regulatory pressure.
1
12
75
22,092
Replying to @So8res @HumanHarlan
"We'll be fine (the pilot is having a heart attack but superman will catch us)" is very different from "We'll be fine (the plane is not crashing)". I worry that people saying the former are assuaging the concerns of passengers with pilot experience, who'd otherwise take the cabin
2
14
170
14,569
If the labs were coming right out and saying: “Yes, we’re endangering all your lives, with >1/6 probability, but we believe that’s OK because the benefits are sufficiently great / we believe we have to because otherwise people that you like even less will kill everybody first,"
2
5
72
28,508
can't tell whether there's only two sex positions that everybody pretends are lots of different positions, or
6
4
74
Cucumber? Central example of a vegetable. Strawberry? Central example of a fruit. If you're having trouble figuring out which is which, ask some local children to help you out.
3
6
69
2. people and institutions lauded as genius are often held together by only bubble-gum, wishes, and a favorable environment. if you rely on those people/institutions to accomplish great feats of competence under pressure, you're in trouble.
1
4
71
Enough model-building, more shitposting: the US should start taking that "all [people] are created equal" clause seriously, and declare that everyone in the world has US citizenship if they but wish it so.
3
6
70
"Have licenses" and "run evals" are fine suggestions, they’re helpful, but they’re not how a sane planet responds to this level of horrific threat. The sane response is to shut it down entirely, and find some other route.
2
2
70
9,598
Yet somehow, once we figured out about genealogy, the pedants were like "well actually this fish's uncle was a fuzzy pigdeer, so it's not actually a fish, you uneducated idiot, you absolute moron" and then we all forgot what "fish" meant out of sheer shame or something???
2
4
65
The world is full of people who will say they've "taken appropriate safety measures" or otherwise give a useless superficial response before plowing on ahead without addressing the deeper underlying issues.
2
1
64
7,717
it seems many people intuitively think that words like "value" can only apply to the legible and easily articulable aspects of things.
2
5
68
28,723