Rob Bensinger ⏹️ · Apr 8, 2026 · 3:34 AM UTC

Rob Bensinger ⏹️

Pinned Tweet

Rob Bensinger ⏹️

@robbensinger

Apr 8

Who should I add to this? Also, did I get anyone's view wrong?

406

66,836

Rob Bensinger ⏹️ · Sep 17, 2024 · 11:31 PM UTC

Rob Bensinger ⏹️

@robbensinger

17 Sep 2024

I've been really enjoying @mattyglesias' "the thing you said is not literally true, and yes I'm going to be annoying about it, stop saying literally false things" schtick. The world desperately, desperately needs more annoying pedantry, as a tax on the ratchet of hyperbole.

100

2,225

131,159

Rob Bensinger ⏹️ · Aug 4, 2022 · 12:18 AM UTC

Rob Bensinger ⏹️

@robbensinger

4 Aug 2022

940

Rob Bensinger ⏹️ · Mar 14, 2022 · 4:02 AM UTC

Rob Bensinger ⏹️

@robbensinger

14 Mar 2022

"Bad take" bingo cards are terrible, because they never actually say what's wrong with any of the arguments they're making fun of. So here's the "bad AI alignment take bingo" meme that's been going around... but with actual responses to the "bad takes"!

159

877

Rob Bensinger ⏹️ · Dec 15, 2022 · 1:05 AM UTC

Rob Bensinger ⏹️

@robbensinger

15 Dec 2022

A surprising thing I've realized over time is that I can often outperform without being super clever, just by doing normal garden-variety thinking and not letting the thinking get derailed by [List of Tempting Distractions and Simple Mistakes].

747

Rob Bensinger ⏹️ · Nov 19, 2023 · 12:48 AM UTC

Rob Bensinger ⏹️

@robbensinger

19 Nov 2023

There sure is a lot of Twitter discourse the last 24 hours from people who seem to legitimately not realize that all the leadership conflicts and disagreements at OpenAI, Anthropic, etc. are between people who share the view that AI isn't unlikely to kill literally all humans.

535

135,440

Rob Bensinger ⏹️ · Jan 24, 2025 · 5:12 AM UTC

Rob Bensinger ⏹️

@robbensinger

24 Jan 2025

It's almost impossible to put into words just how insane, just how plain stupid, the current situation is. We're watching smart, technical people get together to push projects that are literally going to get every person on the planet killed on the default trajectory.

501

64,701

Rob Bensinger ⏹️ · Oct 15, 2024 · 12:56 AM UTC

Rob Bensinger ⏹️

@robbensinger

15 Oct 2024

489

21,650

Rob Bensinger ⏹️ · Nov 22, 2023 · 1:43 AM UTC

Rob Bensinger ⏹️

@robbensinger

22 Nov 2023

Seems like the first news article with leaks from the board, and possibly the first to represent something like their perspective?
I have to say, if @sama was trying to keep board members from saying anything negative about OpenAI's safety practices in public, I think this is really bad behavior on Sam's part. Sam acknowledges that the stuff OpenAI's trying to build toward has a strong chance of killing literally everyone on Earth. This is not a game. These are not normal tech companies.
If no one with privileged information about these companies is allowed to say anything public about the relative merits of DeepMind's approach, OpenAI's approach, Anthropic's approach, etc., then that severely limits humanity's ability to use any tools at all to improve our approaches and converge on sane practices. If you disagree with Toner's analysis, you should argue against it, not try to suppress it in order to protect your company's reputation and bottom line. This seems totally obvious to me.
(Though, again, maybe the NYT article here isn't fairly summarizing what happened. The article was co-written by Cade Metz, who often gets lots of basic facts wrong in his articles; indeed, I can spot some obvious errors in this one already.)
I do have to say, I remain confused about why the board has said so little about their reasoning. I don't know what the real story is here, but if the real story is "EAs trust Dario to safely steward AI but don't trust @sama", then even if they're right to think Sam is too reckless for this role, it's important that they be open and public about their reasoning. (But maybe that's not their motivation at all? I notice I'm confused about a bunch of things here.)
Stepping back a bit, as someone who's pretty lacking in details about what's been happening at OpenAI internally, it sure seems concerning from where I'm standing that OpenAI's entire safety leadership team and a big chunk of its safety talent mutinied a few years ago, apparently based on concerns that safety wasn't being taken seriously enough...
... and that OpenAI's nonprofit board then came around to the same view a few years later ("Sam and OpenAI aren't prioritizing safety enough")...
... and that in neither case has there been any public accounting at all of what exactly happened, what the different factions' arguments are, etc.
And if the NYT anecdote is true and Sam is pretty obsessed with blocking people from criticizing OpenAI in public, Sam may be directly responsible for no public accounting having ever occurred. (Or maybe there's a more general "don't criticize us or let the public / outsiders be in a position to evaluate our safety practices" culture at the company.)
If there were serious safety concerns behind Dario and co. exiting OpenAI years ago, then it seems like there almost can't be a prosocial justification for keeping the facts under wraps about what happened, years later. Surely this stuff matters. Surely the public, and employees at these companies, and the broader research community would be in a better position to evaluate "is company X going to get us all killed?" if there weren't this effort to suppress discussion.

Sheel Mohnot

@pitdesi

22 Nov 2023

New info!
- Sam was trying to push Helen out for her academic paper critical of OAI; Ilya sided with her to push out Sam
- The Anthropic folks had also tried to push Sam out
- There are 6 board members bc of disagreement on who to add
- Helen ok to destroy OAI for the mission

485

225,562

Rob Bensinger ⏹️ · Aug 10, 2025 · 8:41 PM UTC

Rob Bensinger ⏹️

@robbensinger

10 Aug 2025

Replying to @bananafitz

480

31,126

Rob Bensinger ⏹️ · Oct 29, 2024 · 2:48 AM UTC

Rob Bensinger ⏹️

@robbensinger

29 Oct 2024

Aella's a friend of mine, and she seems to be the person I know who gets by far the largest amount of obviously unreasonable and extreme Internet hate. (Not even factoring in the stalking, death threats, murder attempts, et cetera. Christ.) It's true, and seems worth saying.

431

93,372

Rob Bensinger ⏹️ · Sep 24, 2024 · 3:11 AM UTC

Rob Bensinger ⏹️

@robbensinger

24 Sep 2024

Feels under-remarked on that the top 3 AI labs respectively forecast "full" AGI (or in the case of Anthropic, AIs that are autonomously replicating, accumulating resources, “have become the primary source of national security risk in a major area”, etc.) in 1-4, ~6, or 6-7 years.

430

38,234

Rob Bensinger ⏹️ · Dec 7, 2023 · 10:38 AM UTC

Rob Bensinger ⏹️

@robbensinger

7 Dec 2023

A common mistake I see people make is that they assume AI risk discourse is like the left image, when it's actually like the right image. I think part of the confusion comes from the fact that the upper right quadrant is ~empty. People really want some group to be upper-right.

392

91,599

Rob Bensinger ⏹️ · Jun 13, 2024 · 5:33 PM UTC

Rob Bensinger ⏹️

@robbensinger

13 Jun 2024

What if we just decided to make AI risk discourse not completely terrible?

394

92,078

Rob Bensinger ⏹️ · Dec 12, 2023 · 10:19 PM UTC

Rob Bensinger ⏹️

@robbensinger

12 Dec 2023

Here are some of my views on AI x-risk. I'm pretty sure these discussions would go way better if there was less "are you in the Rightthink Tribe, or the Wrongthink Tribe?", and more focus on specific claims. Maybe share your own version of this image, and start a conversation?

389

157,675

Rob Bensinger ⏹️ · Aug 24, 2024 · 5:41 PM UTC

Rob Bensinger ⏹️

@robbensinger

24 Aug 2024

Regular reminder that MIRI folks consider it plausible that AI just keeps being more and more beneficial for society up until the day before AI causes everyone to drop dead in the same five seconds. The x-risk view has never been very close to the generic "AI bad, boo AI" view.

346

42,392

Rob Bensinger ⏹️ · Jun 23, 2025 · 10:16 PM UTC

Rob Bensinger ⏹️

@robbensinger

23 Jun 2025

AI companies are currently actively trying to build smarter-than-human AI. If they succeed, then every man, woman, and child on Earth is probably going to die. This is actually happening.
I, Robby Bensinger, am genuinely scared for myself, my loved ones, and the rest of you over the next two years. If we have more than fifteen years, I'll consider us lucky. Nobody knows how far off this technology is, but there's an insanely huge and well-funded effort to build it, and senior researchers in AI generally agree that we're probably only 2 or 5 or 10 years away, not 20+ years away.
The non-profit I work for is rushing out a book to try to urgently alert policymakers and the public about the situation, so that the international community has a chance of responding quickly enough. We've gotten some incredibly strong endorsements, including:
- Jack Shanahan, a retired three-star general and the inaugural director of the Pentagon’s Joint AI Center, the coordinating hub for bringing AI to every branch of the US military.
- Bruce Schneier, one of the most prominent computer security experts in the world.
- Jon Wolfsthal, the Obama administration's senior nuclear security advisor.
- Suzanne Spaulding, the former head of the DHS Cybersecurity and Infrastructure Security Agency, the US government's main agency for cybersecurity and critical infrastructure security.
Questions welcome.
If you want to help: go to ifanyonebuildsit dot com to preorder the book. Reshare this image and get your friends / families / communities to preorder this. Have conversations about this and get the word out there. The book is our main vehicle for trying to inform as many journalists, public figures, and policy people as we can, as fast as possible. Preorders have an outsized impact on how many people hear about the book, via impacting bestseller lists and print run sizes.
This is actually happening, and we need your help.

Rob Bensinger ⏹️

@robbensinger

20 Jun 2025

Senior White House officials, a retired three-star general, a Nobel laureate, and others come out to say that you should probably read Eliezer Yudkowsky and Nate Soares' "If Anyone Builds It, Everyone Dies". Preorders are live.

347

47,275

Rob Bensinger ⏹️ · Nov 10, 2023 · 1:52 AM UTC

Rob Bensinger ⏹️

@robbensinger

10 Nov 2023

AI didn't spend a long time with roughly human-level ability to imitate art styles, before it became vastly superhuman at this skill. Yet for some reason, people seem happy to stake the future on the assumption that AI will spend a long time with ~par-human science ability.

334

101,567

Rob Bensinger ⏹️ · May 18, 2025 · 4:49 PM UTC

Rob Bensinger ⏹️

@robbensinger

18 May 2025

There's a lot of morbid excitement about whether the probability of us killing our families w AI is more like 50% or like 80% or 95%, where a saner and healthier discourse would go "WAIT, THIS IS CRAZY. ALL OF THOSE NUMBERS ARE CLEARLY UNACCEPTABLE. WHAT THE FUCK IS HAPPENING?"

321

68,658

Rob Bensinger ⏹️ · Nov 26, 2023 · 1:43 AM UTC

Rob Bensinger ⏹️

@robbensinger

26 Nov 2023

Replying to @astupple @slatestarcodex

I think you're just missing context on what happened here. Circa 2000, a set of transhumanist freedom-loving libertarians realized that we don't know on a technical level how to get good outcomes from AI, and if we don't figure it out in advance we're likely to destroy ourselves.

317

156,991

Rob Bensinger ⏹️ · Jun 6, 2024 · 10:31 PM UTC

Rob Bensinger ⏹️

@robbensinger

6 Jun 2024

My take on Leopold Aschenbrenner's new report: I think Leopold gets it right on a bunch of important counts. Three that I especially care about:
1 - Full AGI and ASI soon. (I think his arguments for this have a lot of holes, but he gets the basic point that superintelligence looks 5 or 15 years off rather than 50+.)
2 - This technology is an overwhelmingly huge deal, and if we play our cards wrong we're all dead.
And 3 - Current developers are indeed fundamentally unserious about the core risks, and need to make IP security and closure a top priority.
I especially appreciate that the report seems to get it when it comes to our basic strategic situation: it gets that we may only be a few years away from a truly world-threatening technology, and it speaks very candidly about the implications of this, rather than soft-pedaling it to the degree that public writings on this topic almost always do. I think that's a valuable contribution all on its own.
Crucially, however, I think @leopoldasch gets the wrong answer on the question "is alignment tractable?". That is: OK, we're on track to build vastly smarter-than-human AI systems in the next decade or two. How realistic is it to think that we can control such systems?
Leopold acknowledges that we currently only have guesswork and half-baked ideas on the technical side, that this field is extremely young, that many aspects of the problem look impossibly difficult (see attached image), and that there's a strong chance of this research operation getting us all killed. "To be clear, given the stakes, I think 'muddling through' is in some sense a terrible plan. But it might be all we’ve got." Controllable superintelligent AI is a far more speculative idea at this point than superintelligent AI itself.
I think this report is drastically mischaracterizing the situation. ‘This is an awesome exciting technology, let's race to build it so we can reap the benefits and triumph over our enemies’ is an appealing narrative, but it requires the facts on the ground to shake out very differently than how the field's trajectory currently looks.
The more normal outcome, if the field continues as it has been, is: if anyone builds it, everyone dies.
This is not a national security issue of the form ‘exciting new tech that can give a country an economic or military advantage’; it's a national security issue of the form ‘we've found a way to build a doomsday device, and as soon as anyone starts building it the clock is ticking on how long before they make a fatal error and take themselves out, and take the rest of the world out with them’.
Someday superintelligence could indeed become more than a doomsday device, but that's the sort of thing that looks like a realistic prospect if ASI is 50 or 150 years away and we fundamentally know what we're doing on a technical level, not if it's more like 5 or 15 years away, as Leopold and I agree.
The field is not ready, and it's not going to suddenly become ready tomorrow. We need urgent and decisive action, but to indefinitely globally halt progress toward this technology that threatens our lives and our children's lives, not to accelerate ourselves straight off a cliff.
Concretely, the kinds of steps we need to see ASAP from the USG are:
- Spearhead an international alliance to prohibit the development of smarter-than-human AI until we’re in a radically different position. The three top-cited scientists in AI (Hinton, Bengio, and Sutskever) and the three leading labs (Anthropic, OpenAI, and DeepMind) have all publicly stated that this technology's trajectory poses a serious risk of causing human extinction (in the CAIS statement). It is absurd on its face to let any private company or nation unilaterally impose such a risk on the world; rather than twiddling our thumbs, we should act.
- Insofar as some key stakeholders aren’t convinced that we need to shut this down at the international level immediately, a sane first step would be to restrict frontier AI development to a limited number of compute clusters, and place those clusters under a uniform monitoring regime to forbid catastrophically dangerous uses. Offer symmetrical treatment to signatory countries, and do not permit exceptions for any governments. The idea here isn’t to centralize AGI development at the national or international level, but rather to make it possible at all to shut down development at the international level once enough stakeholders recognize that moving forward would result in self-destruction. In advance of a decision to shut down, it may be that anyone is able to rent H100s from one of the few central clusters, and then freely set up a local instance of a free model and fine-tune it; but we retain the ability to change course, rather than just resigning ourselves to death in any scenario where ASI alignment isn’t feasible.
Rapid action is called for, but it needs to be based on the realities of our situation, rather than trying to force AGI into the old playbook of far less dangerous technologies. The fact that we can build something doesn't mean that we ought to, nor does it mean that the international order is helpless to intervene.

291

68,277

Rob Bensinger ⏹️ · Jan 5, 2023 · 9:55 PM UTC

Rob Bensinger ⏹️

@robbensinger

5 Jan 2023

I side with Scott Aaronson in saying that the Chinese room thought experiment is sophistry. I don't know why it gets treated as anything else by any intellectual?

Fake Mario

@ShakedDown

5 Jan 2023

Replying to @robbensinger @MikePFrank @SturnioloSimone @leventov @smatta1701 @heyorson @davidmanheim @Tris_Legomenon @AyeGill @beffjezos @cat_fro_devnull @bayeslord @RollinReisinger @xlr8harder @JeffLadish @LesaunH

I'm on the fence here. It's plausible that consciousness is an entirely separate phenomenon and that Chinese room/AI can't have it (but could be trained to convincingly sound like it does). Maybe assuming it can't is safer.

312

84,366

Rob Bensinger ⏹️ · Mar 22, 2023 · 2:16 AM UTC

Rob Bensinger ⏹️

@robbensinger

22 Mar 2023

The way ML developed post-2010 seems like more or less a worst-case scenario for humanity:
- AI is opaque.
- ML is impressive enough to build hype and shorten timelines, but not to help save the world at all.
- We got there with brute force, not new insights into minds.

298

73,859

Rob Bensinger ⏹️ · Jun 20, 2025 · 11:22 PM UTC

Rob Bensinger ⏹️

@robbensinger

20 Jun 2025

294

65,169

Rob Bensinger ⏹️ · Oct 24, 2023 · 5:48 PM UTC

Rob Bensinger ⏹️

@robbensinger

24 Oct 2023

Seems like one of the more important facts about our civilization -- we live in the world where paying people is seen as taking advantage of them, while lying to people is seen as normal and OK. (In a surprisingly large number of cases.)

279

17,512

Rob Bensinger ⏹️ · Aug 10, 2025 · 9:30 PM UTC

Rob Bensinger ⏹️

@robbensinger

10 Aug 2025

It's funny how everyone thinks of the Prisoner's Dilemma as "bog-standard dilemma, you're a dick if you defect" and thinks of Newcomb's Problem as "insane paradox, one-boxing is crazy", even though they're literally the exact same problem and two-boxing is identical to defecting.

297

18,828

Rob Bensinger ⏹️ · May 31, 2023 · 2:26 AM UTC

Rob Bensinger ⏹️

@robbensinger

31 May 2023

I'm going to lose an ungodly, unrecoverable number of Bayes points if humanity somehow sticks the landing on this whole AI thing and it's going to be glorious

271

33,084

Rob Bensinger ⏹️ · Nov 20, 2023 · 6:05 AM UTC

Rob Bensinger ⏹️

@robbensinger

20 Nov 2023

With the benefit of hindsight, the last few days really look like "powerful people fought, and journalists were purely functioning as easily-controlled pawns in the power struggle". Which... for all the criticisms I have of journalists, is not something I remember seeing before.

266

40,664

Rob Bensinger ⏹️ · Apr 17, 2025 · 1:39 AM UTC

Rob Bensinger ⏹️

@robbensinger

17 Apr 2025

Right: Trade is negative sum, so let's crush everyone else into the dust so we can be the winners!
Left: Trade is negative sum, so let's crush the winners in order to rescue everyone who's being ground into the dust!
Reality: (doing its own unrelated thing over in the corner)

267

12,915

Rob Bensinger ⏹️ · Nov 23, 2023 · 8:30 PM UTC

Rob Bensinger ⏹️

@robbensinger

23 Nov 2023

We didn't learn "they can't fire him". We did learn that the organization's staff has enough faith in Sam that the staff won't go along with the board's wishes absent some good supporting arguments from the board. (Whether they'd have acceded to good arguments is untested.)

Toby Ord

@tobyordoxford

23 Nov 2023

The last few days exploded the myth that Sam Altman's incredible power faces any accountability. He tells us we shouldn't trust him, but we now know the board *can't* fire him. I think that's important.

266

32,315

Rob Bensinger ⏹️ · Oct 27, 2023 · 10:31 PM UTC

Rob Bensinger ⏹️

@robbensinger

27 Oct 2023

I implore everyone who agrees with me and everyone who disagrees with me: please have good discourse. Please say true and relevant things. Concede points from the other side when they're right. Focus on conversational cruxes. Fight the urge to zing.

251

24,440

Rob Bensinger ⏹️ · Jun 26, 2025 · 12:30 AM UTC

Rob Bensinger ⏹️

@robbensinger

26 Jun 2025

Here's your regular reminder that the three most cited living AI scientists (Bengio, Hinton, and Sutskever) signed a statement saying that "Mitigating the risk of extinction from AI should be a global priority". safe.ai/work/statement-on-ai…

259

14,633

Rob Bensinger ⏹️ · Sep 23, 2024 · 3:38 PM UTC

Rob Bensinger ⏹️

@robbensinger

23 Sep 2024

It was a vastly better allegory for AI than for climate change even when it was made. I'd go so far as to say that, treated as a climate change allegory, Don't Look Up is low-grade hysterical propaganda. As an analogy for AI, it actually makes sense and fits the dynamics.

Daniel Eth (yes, Eth is my actual last name)

@daniel_271828

23 Sep 2024

If Don’t Look Up was made today, it would be an allegory for AI instead of for climate

244

16,335

Rob Bensinger ⏹️ · Dec 17, 2021 · 3:06 AM UTC

Rob Bensinger ⏹️

@robbensinger

17 Dec 2021

What's the best way to get every journalist in the world to read this article? (Or failing that, get every journalist at the fifteen most widely-read serious English-speaking news outlets to read it.) Extra credit: make 'we've read this' common knowledge. astralcodexten.substack.com/…

245

Rob Bensinger ⏹️ · May 18, 2025 · 6:02 PM UTC

Rob Bensinger ⏹️

@robbensinger

18 May 2025

252

13,020

Rob Bensinger ⏹️ · Mar 11, 2025 · 5:35 PM UTC

Rob Bensinger ⏹️

@robbensinger

11 Mar 2025

"Don't do the naively evil-sounding thing" is a surprisingly powerful heuristic. Lying, cutting out your humanity, giving in to cowardice, allying with people-who-feel-like-baddies-to-you for power... you get a surprising amount of pragmatic mileage out of just Not Doing That.

244

7,303

Rob Bensinger ⏹️ · Aug 14, 2025 · 8:32 PM UTC

Rob Bensinger ⏹️

@robbensinger

14 Aug 2025

My most heterodox opinion: There's a real chance you could solve 10+% of the world's big problems if you just signal-boosted some novel discourse norms. E.g., a single prime-time debate game show that made r/changemymind or CFAR or AoA or AR memes go viral might upend the world.

251

15,741

Rob Bensinger ⏹️ · Apr 19, 2024 · 11:18 PM UTC

Rob Bensinger ⏹️

@robbensinger

19 Apr 2024

The thing I found most disturbing in the board debacle was that hundreds of OpenAI staff signed a letter that appears to treat the old-fashioned OpenAI view "OpenAI's mission of ensuring AGI benefits humanity matters more than our success as a company" as not just wrong, but beyond the pale.
Prioritizing your company's existence over the survival and flourishing of humanity seems like an obviously crazy view to me, to the point that I suspect most of the people who signed the letter don't actually think that "OpenAI shuttering is consistent with OpenAI's mission" is a disqualifying or cancellable view to express within OpenAI. I assume the letter was drafted by people who weren't thinking very clearly and were under a lot of emotional pressure, and people mostly signed it because they agreed with other parts of the letter.
I still find it pretty concerning that cascading miscommunications like this might cause it to become the case that a false consensus forms around "fuck OpenAI's mission, the org and its staff are what really matters" within the organization. At the very least, I would love to know that there hasn't been a chilling effect discouraging people from expressing the opinion "our original mission is still the priority", so that OpenAI feels comfortable debating this internally insofar as there's disagreement.
I encourage @sama and the leadership at @OpenAI to clarify that the relevant part of the letter doesn't represent your values and intentions here, and I encourage OpenAI staff to publicly clarify their personal stance on this, especially if you signed the letter but don't endorse that part of it (or didn't interpret that part the way I'm interpreting it here).
None of this strikes me as academic; OpenAI leadership knows that it's building something that could cause human extinction, and has said as much publicly on many occasions. Just say what your priorities are. (And if a lot of staff haven't gotten the memo about that, hell, remind them.)

233

114,179

Rob Bensinger ⏹️ · May 26, 2025 · 9:35 PM UTC

Rob Bensinger ⏹️

@robbensinger

26 May 2025

This book made me cry. I was actually legitimately shocked by how good it is. And the topic is completely insane and wildly consequential; like, it's hard to oversell. Comes Sept; preorder at ifanyonebuildsit.com/, and go down your list of friends and tell 'em to preorder too!

241

95,287

Rob Bensinger ⏹️ · Nov 21, 2022 · 1:56 AM UTC

Rob Bensinger ⏹️

@robbensinger

21 Nov 2022

When a thing is good or bad, it's usually not good/bad in every respect simultaneously. The impulse to make @elonmusk either good-on-all-dimensions or bad-on-all-dimensions simultaneously should be a red flag for motivated reasoning.

231

Rob Bensinger ⏹️ · Nov 21, 2023 · 1:51 AM UTC

Rob Bensinger ⏹️

@robbensinger

21 Nov 2023

Could someone from OpenAI explain to me why y'all have highlighted this quote as one of the main objections to the board's conduct? To my ear, if there's no world where it would be OK to shutter OpenAI, then it's not OK to shutter OpenAI even if the org is causing net harm and putting everyone at danger; to say that OpenAI ending is flat-out inconsistent with the mission seems like a super strong statement to me. Maybe that sounds like an unrealistic scenario to you, but to rule out "oh shit, are we the baddies?" in principle, no matter what happens in the future seems like a very strange move to me, for an org aspiring to build tech that all the leadership acknowledges has a real chance of killing every man, woman, and child on the planet. My guess is that you just have a different read of the board quote than I do. Could someone help me understand the perspective here? E.g., maybe I'm reading it as "we can't totally and unequivocally rule this out as an option, our duty is ultimately to humanity and not to ourselves", while you're reading this as "we are making the empirical assertion that currently there's a non-tiny chance OpenAI's existence is net-negative", and you think this is false and implies the board has a bunch of importantly false beliefs about the world. (Like, if you think AGI is a hundred years away and OpenAI's foreseeable impact is dominated by the obviously huge amount of immediate value it's provided to its current customers, then I could imagine you thinking the board is flatly wrong to assign non-negligible probability to "OpenAI is net harmful" at this point in time, if that's what the board is doing.)

217

41,523

Rob Bensinger ⏹️ · May 18, 2024 · 1:32 AM UTC

Rob Bensinger ⏹️

@robbensinger

18 May 2024

This level of reputation management seems congruent with the reporting that @sama tried to get @hlntnr kicked off the OpenAI board for publicly criticizing some of OpenAI's safety practices, and that this sparked the board conflict.
I'm actually very sympathetic to orgs like OpenAI having unusual, heavy-duty secrecy policies. OpenAI leadership acknowledges that they're building toward tech that could kill literally all humans on Earth. I think they should stop and not build toward such tech; but conditional on "we're building this", it seems obvious to me that you need to be way stricter about leaking technical insights here than you would be about leaking normal IP, since leaking this stuff takes you from a world where doom is plausible to one where it's inevitable. ("One actor has a nuclear bomb" is scary; "everyone has a nuclear bomb in their backyard" is guaranteed death.)
But "don't leak IP" seems very different from "never criticize the org for the rest of your life". In fact, the safety tradeoffs seem exactly reversed: there are huge safety risks to deceiving the world (and your employees!) about OpenAI's track record. It's a lot worse to do this when the stakes are huge. And letting people say positive things but not negative things is very obviously deceptive, particularly when you're forbidding those people from publicly acknowledging that they've made a commitment to only say positive things about you, so people are deceived even about the fact that their evidence is being filtered.
I've previously been pretty agnostic about whether Ilya and the EAs on the OpenAI board were the Bad Guys in the board dispute, partly because I had a hard time seeing how it could possibly be reasonable to not share your side of the story for so long. If you're scared of sharing your side of the story, that's evidence that your side of the story has holes in it. ...
But it's now a lot harder for me to update on things like that, because I now know that one of the core tools in the OpenAI toolbox is "demand that people not criticize you as a condition for the deal going through". And insofar as people know that this is an overriding priority for OpenAI leadership, and a tool they're very eager to deploy, they have an incentive to keep their cards close to their chest so as not to constrain the space of future deals that will be possible later. "Leak out your initial thoughts so people can understand what happened" now has a drastically higher opportunity cost.
If OpenAI is going massively out of its way to make sure I only see positive employee accounts about OpenAI and no negative ones, then I obviously can't update on the positive ones. (Well, I guess I could trust positive reports from someone like @DKokotajlo67142 who's on the record as having turned down all of OpenAI's deals.) And I'm having a hard time seeing a perspective that would think it was a good idea to deceive me (and the rest of the world, and OpenAI's employees) in this way. 🫤

Kelsey Piper

@KelseyTuoc

17 May 2024

Replying to @KelseyTuoc

Equity is part of negotiated compensation; this is shares (worth a lot of $$) that the employees already earned over their tenure at OpenAI. And suddenly they're faced with a decision on a tight deadline: agree to a legally binding promise to never criticize OpenAI, or lose it.

212

28,022

Rob Bensinger ⏹️ · Oct 24, 2024 · 7:42 PM UTC

Rob Bensinger ⏹️

@robbensinger

24 Oct 2024

"Humans do this weird thing called 'empathy' where they basically have a seizure and start thinking they're a totally different person. Some humans can even have these seizures and think they're a dog, or an inanimate object. And they're addicted to these seizures and think they're a cornerstone of their civilization and purpose in life, even though they're constantly giving up food and sex and power opportunities as a result of the seizures! It would be fascinating if it weren't so sad."

Rob Bensinger ⏹️

@robbensinger

24 Oct 2024

Replying to @robertwiblin

What would a typical insect's version of this fact list look like if they were writing it about humans or mammals?

211

23,497

Rob Bensinger ⏹️ · Jun 17, 2022 · 9:12 AM UTC

Rob Bensinger ⏹️

@robbensinger

17 Jun 2022

A lot of the relative placements on that AGI political compass meme seemed very wrong to me, so here's one that does match my current impressions: (My incredibly vague, amazingly low-confidence, June 17 2022 impressions.)

211

Rob Bensinger ⏹️ · Apr 4, 2023 · 7:58 PM UTC

Rob Bensinger ⏹️

@robbensinger

4 Apr 2023

I've been citing lesswrong.com/posts/uMQ3cqWD… to explain why the situation with AI looks doomy to me. But that post is relatively long, and emphasizes specific open technical problems over "the basics". Here are 10 things I'd focus on if I were giving "the basics" on why I'm worried:

196

75,651

Rob Bensinger ⏹️ · Feb 10, 2023 · 12:05 AM UTC

Rob Bensinger ⏹️

@robbensinger

10 Feb 2023

I'm not a big fan of the "takeoff" analogy for AGI. In real life, AGI doesn't need to "start on the ground". You can just figure out how to do AGI and find that the easy way to do AGI immediately gets you a model that's far smarter than any human. Less "takeoff", more "teleport".

194

50,154

Rob Bensinger ⏹️ · May 24, 2025 · 10:07 PM UTC

Rob Bensinger ⏹️

@robbensinger

24 May 2025

The advice is fine, but the missing mood is a doozy. "Be sure to practice self-care, and don't forget to take time for you! 😇" "... Uh, OK, but are you sure you heard me about the whole AI-killing-everyone-we-love thing? Just double-checking that you heard me."

193

13,717

Rob Bensinger ⏹️ · Jan 13, 2023 · 7:43 AM UTC

Rob Bensinger ⏹️

@robbensinger

13 Jan 2023

Conservatives and progressives are gradually doing the equivalent of brain-damaging each other over time, by associating various cognitive steps and contents with a particular political coalition and therefore causing the rival coalition to be less able to think certain thoughts.

187

12,643

Rob Bensinger ⏹️ · Feb 28, 2025 · 3:54 PM UTC

Rob Bensinger ⏹️

@robbensinger

28 Feb 2025

The more out-of-fashion it becomes to spend comically small amounts of money to stop infants from contracting and dying of HIV, the more I come around to thinking it really is just dead-obvious garden-variety effective altruism that the world needs right now.

Kelsey Piper

@KelseyTuoc

25 Feb 2025

Some people have asked me why private donations can't step in to replace PEPFAR. But even organizations that have successfully raised private money can't figure out where to order the drugs amidst the supply chains shattered by USAID being 'fed to the wood chipper':

192

6,036

Rob Bensinger ⏹️ · Dec 10, 2022 · 11:17 PM UTC

Rob Bensinger ⏹️

@robbensinger

10 Dec 2022

My thoughts on EA optics:

187

Rob Bensinger ⏹️ · Apr 14, 2025 · 10:29 PM UTC

Rob Bensinger ⏹️

@robbensinger

14 Apr 2025

So: the administration is claiming the authority to disappear any human being in the US, citizen or not, to overseas prisons? Permanently, without due process, and for literally any reason (since it explicitly includes cases where they fucked up)? Am I getting that right?

189

5,099

Rob Bensinger ⏹️ · Apr 20, 2025 · 7:28 PM UTC

Rob Bensinger ⏹️

@robbensinger

20 Apr 2025

Yes, now that "11 years away" has been reclassified as a "slow AI timeline".

David Manheim

@davidmanheim

20 Apr 2025

AI skeptics are right to be skeptical of very short AGI timelines. There will likely still be things AI systems can't (yet) do for several years. AI alarmists are right to be alarmed - humanity doesn't react to crises quickly enough for even slow AI timelines to be safe.

186

16,527

Rob Bensinger ⏹️ · Mar 5, 2025 · 11:45 PM UTC

Rob Bensinger ⏹️

@robbensinger

5 Mar 2025

OK, but AI doomerism is true. (In the sense that we're super likely to kill ourselves if we manage to build superintelligent AI in the next decade, not in the sense that there's nothing we can do about the situation.)

181

11,177

Rob Bensinger ⏹️ · May 24, 2025 · 6:31 PM UTC

Rob Bensinger ⏹️

@robbensinger

24 May 2025

Replying to @Aella_Girl

I think all three of these are good things: "giving something without expecting anything in return", "giving something while expecting a specific thing in return", and "giving something with the expectation of this vaguely influencing a social ledger of back-and-forth favors".

184

6,937

Rob Bensinger ⏹️ · Dec 2, 2023 · 10:09 AM UTC

Rob Bensinger ⏹️

@robbensinger

2 Dec 2023

Is this @ylecun's view?: 1. The probability of AGI being developed by a method other than my favored one is negligible. 2. The probability of my favored approach being hard to align is negligible. 3. The probability of early AGIs being cheap to run or very smart is negligible.

179

27,583

Rob Bensinger ⏹️ · Nov 28, 2023 · 7:15 PM UTC

Rob Bensinger ⏹️

@robbensinger

28 Nov 2023

Or, in simpler terms:

megabase @the_megabase

26 Nov 2023

political compass of effective altruist critiques (long version)

180

20,775

Rob Bensinger ⏹️ · Apr 20, 2024 · 11:57 PM UTC

Rob Bensinger ⏹️

@robbensinger

20 Apr 2024

The impression I'm getting from some OpenAI staff is that their view is something like: "OpenAI's 1200+ employees are, pretty much to a man, extremely committed to the nonprofit mission. Effectively all of us take existential risk from AI seriously, and would even be willing to undergo a lot of personal turmoil and financial loss if that's what it took to ensure OpenAI's mission succeeds. (E.g., if OpenAI had to shut down, reduce team size, or scale down its ambitions in response to safety concerns.) "This is true to such an extreme degree that I have a hard time imagining it being non-obvious to anyone who's been in the x-risk space for more than 30 seconds. There's literally no point in us reiterating our commitment to existential risk reduction for the thousandth time. You claim that the staff open letter was ambiguous, but I don't think it's ambiguous at all; I feel like you're trying to tar our reputation and get us to jump through hoops for no reason, when it should be obvious to everyone that OpenAI staff and leadership have OpenAI's social impact as their highest priority." To which I say: I've never worked at OpenAI. The stuff that's obvious to you isn't obvious to me. I'm having to piece together the situation from talking to OpenAI staff, and some of them are saying stuff like the above, and others are saying the opposite. Which leaves me pretty fuzzy about what the actual situation is, and pretty keen to hear something more definitive from OpenAI leadership, or at least to hear a slightly longer account that helps me see how to reconcile the conflicting descriptions. I flatly disagree with "the staff open letter wasn't ambiguous", and I strongly suspect that this view is coming from an illusion-of-transparency sort of place: empirically, people often have a really hard time seeing how a sentence can be interpreted differently once they have their own interpretation in mind. 
But also, human nature being what it is, it is not unheard-of for people to endorse a high-level nice-sounding claim, while balking at some of the less-nice-sounding logical implications of that claim. @OpenAI tweeting out "We affirm that OpenAI wants the best for everyone" is easy mode. OpenAI tweeting out "We affirm that shutting down OpenAI is consistent with the nonprofit mission, and if we ever think shutting down is the best way to serve that mission, we'll do it in a heartbeat" is genuinely less easy. And actually following through is harder still. If you're worried that investors and partners will be marginally less interested in OpenAI if you issue a press statement like that, well: I think that creates an even stronger case for being loud about this. Because you want to be honest and up-front with your investors and partners, but also because to the extent your worry is justified, those investors and partners are creating an incentive pressure for you to back away from your mission later and prioritize near-term profits when push comes to shove. Or to come up with reasons, as needed, for why the seemingly profit-maximizing option is really the long-term-human-welfare-maximizing option after all. If you're correct that OpenAI's staff is currently super invested in the mission, that's awesome! But I expect OpenAI to grow in the future, and to accumulate more investors and more partners. If you're at all worried about mission drift or misaligned incentives in the future, never mind how awesome OpenAI is today, then I think you should be jumping on opportunities like this to clarify that you're actually serious about this stuff, and that you aren't going to say different things to different audiences as convenient, when the issue is this damned central.

Rob Bensinger ⏹️

@robbensinger

19 Apr 2024

173

44,928

Rob Bensinger ⏹️ · Jan 24, 2025 · 5:14 AM UTC

Rob Bensinger ⏹️

@robbensinger

24 Jan 2025

No, I don't assume that Anthropic or OpenAI's timelines are correct, or even honest. It literally doesn't fucking matter, because we're going to be having the same conversation in eight years instead if it's eight years away.

170

11,255

Rob Bensinger ⏹️ · Jan 30, 2023 · 2:13 AM UTC

Rob Bensinger ⏹️

@robbensinger

30 Jan 2023

Today's mood: everyone less autistic than me is a lying unprincipled PR robot, everyone more autistic than me is a goofball who thinks their sensory sensitivities and social preferences are important moral principles that keep communities from collapsing into ruin

174

7,544

Rob Bensinger ⏹️ · Dec 5, 2022 · 6:25 PM UTC

Rob Bensinger ⏹️

@robbensinger

5 Dec 2022

Ran it six times with the same prompt, got "men are taller than women" 6/6 times. Good to check whether a prompt gets you the same result before retweeting, since otherwise Twitter will amplify the most surprising outlier ChatGPT behavior, rather than its usual behavior.

Matthew Yglesias

@mattyglesias

5 Dec 2022

ChatGPT is so averse to stereotypes and generalization that it's reluctant to say men are taller than women.

171

Rob Bensinger ⏹️ · Oct 20, 2022 · 4:02 PM UTC

Rob Bensinger ⏹️

@robbensinger

20 Oct 2022

Reminder: Hume's is-ought distinction is an "is", so it has no "ought" implications.

161

Rob Bensinger ⏹️ · Sep 23, 2024 · 6:37 PM UTC

Rob Bensinger ⏹️

@robbensinger

23 Sep 2024

This video's a model of what AI xrisk protests should aim for, IMO: Grassroots-y. Science-y. Reasoned arguments, not low-content outrage. Good balance of "substantive, but easy to understand". Forceful, but not shrill. Explicit focus on smarter-than-human AI and extinction risk.

166

19,655

Rob Bensinger ⏹️ · Aug 11, 2025 · 1:10 AM UTC

Rob Bensinger ⏹️

@robbensinger

11 Aug 2025

Replying to @Aella_Girl

I think part of why people are getting tripped up by this is that they're conflating "an even-handed source" with "a source that's convinced both sides are similarly bad". Your processes can be relatively transparent, checkable, and reasonable, even if your conclusion is strong.

172

14,881

Rob Bensinger ⏹️ · May 21, 2024 · 1:09 AM UTC

Rob Bensinger ⏹️

@robbensinger

21 May 2024

Thoughts on OpenAI from @ozziegooen:

166

26,366

Rob Bensinger ⏹️ · May 29, 2022 · 10:16 PM UTC

Rob Bensinger ⏹️

@robbensinger

29 May 2022

The Four Non-Blind Men and the Elephant: A Fable Once, while traversing a great wood, four non-blind men happened upon an elephant. All of them said "That's an elephant", because it was an elephant.

164

Rob Bensinger ⏹️ · Jul 4, 2020 · 11:12 PM UTC

Rob Bensinger ⏹️

@robbensinger

4 Jul 2020

.@TwitterSupport, please unban @lukeprog. Nobody knows why he is banned, including Luke, and we are very confused. If anything, it would make more sense to ban everyone else and just have Twitter consist of @lukeprog going forward.

168

Rob Bensinger ⏹️ · Nov 22, 2023 · 6:04 PM UTC

Rob Bensinger ⏹️

@robbensinger

22 Nov 2023

Last I checked, the term for that kind of e/acc is "doomer"

Rumtin

@rumtin

22 Nov 2023

Is there an "e/acc for everything but nukes, bioweapons, AI-enabled warfare and authoritarian surveillance"? Asking for a friend.

165

27,971

Rob Bensinger ⏹️ · Aug 20, 2025 · 6:33 PM UTC

Rob Bensinger ⏹️

@robbensinger

20 Aug 2025

This seems like a mental health crisis to me? Like, this seems less "climate change is a huge problem" and more "climate change is unboundedly bad".

If it's unboundedly bad, then your personal contribution to climate change is also unboundedly bad, and there's no atoning for having been born; there's just doing as much as you can muster to try to atone and pay off at least a little of your lifelong moral debt.

If it's unboundedly bad, then there can be no talk of trading off benefits for costs. It's not "to justify environmental impacts of size X, this needs to have a benefit of size Y". It's "make yourself as small as possible".

There's no idea here that your life is good, and that it might be worthwhile to spend nonzero natural resources to have a good life. Wanting to be alive is just greedy and selfish (though understandable, because we aren't saints).

It's a really strange perspective. A lot of this stuff could make sense if the numbers were wildly different. If you're trapped on a space station for fifty years with limited resources, it really might be a bad idea to have a kid.

People have a really hard time with scope. The sheer size of the planet (to say nothing of other planets) is hard to intuit. People just hear "there's a crazy number of people on the planet, and the planet itself is crazy big", and they can totally buy that maybe those two crazy numbers are similar in size and we're butting up against the limits of our planet's capacity.

I'd love to have a 5-to-15-page resource that just unpolemically explains the situation we're actually in:
- Give concrete examples and visualizations to give a general order-of-magnitude sense of how many people there actually are, how population growth is changing over time, resource availability, impact of tech progress, etc. E.g., scale things down by 100 million times to make it easier to visualize: "Imagine there are 80 people in the world, and the oceans contain the equivalent of somewhat more than 3 billion olympic swimming pools of water..."
- Give a range of plausible climate change outcomes (e.g., 10th-percentile bad, 50th-percentile bad, 90th-percentile bad), and make concrete comparisons. Talk about the moral issues from some people being impacted by climate change more than others, but give a quantitative sense of how bad this is compared to other bad things, and give a sense of how much you'd need to do to offset your impact on other people.
- Give concrete examples and visualizations to convey a sense of how environmentally good or bad various everyday activities actually are. Walk through a typical reader's day, and a typical reader's life, and note the scope of different interventions. Talk a bit about professional specialization and the multipliers you can have by donating to full-time specialists and projects actively working on a cause, and the scale of this impact vs. focusing on your personal day-to-day consumption.

I dunno, seems like it could save an awful lot of souls to just have a single good-quality document that does this.

Andy Masley

@AndyMasley

13 Aug 2025

Folks, I ran the numbers on the UK government's recommendation to delete old photos and emails to save water. Link below

165

11,244

Rob Bensinger ⏹️ · Dec 25, 2024 · 10:26 PM UTC

Rob Bensinger ⏹️

@robbensinger

25 Dec 2024

rage, rage against the tiling of the lightcone

156

8,638

Rob Bensinger ⏹️ · Nov 12, 2023 · 11:46 PM UTC

Rob Bensinger ⏹️

@robbensinger

12 Nov 2023

I disagree with Bostrom's 'society isn't yet worried enough, but I now worry there's a strong chance we'll overreact'. I think underreaction is still hugely more likely, and hugely more costly. But I'm extremely glad x-risk people are the sorts to loudly voice worries like that.

160

48,033

Rob Bensinger ⏹️ · Jan 5, 2024 · 11:16 PM UTC

Rob Bensinger ⏹️

@robbensinger

5 Jan 2024

Regular reminder: in a true prisoner's dilemma, "defect if you can get away with it cost-free" is a crucially important policy. It's actively bad, unvirtuous, unethical, etc. to make a policy of never defecting in those cases. (Imagine meeting a serial killer and aiding them in their murders, to no benefit, because you prize "be cooperative with the person standing in front of you" above all else.) There are many cases where you should cooperate (when your action isn't fully independent of the other player's, or when you're in an iterated dilemma and expect future punishment to outweigh the immediate reward of defection). But it should always seem like a genuinely bad thing to be forced to resort to the (I Cooperate, They Cooperate) part of the matrix rather than the (I Defect, They Cooperate) part. This is more fundamental than understanding the reasons for cooperation, because it's about understanding what the Prisoner's Dilemma even is, i.e., understanding what the payoff structure is. If you don't already viscerally hate the idea of throwing away an (I Defect, They Cooperate) outcome for no benefit, then you aren't ready for the more advanced lessons that are meant to be exceptions to that rule. Instead, you should first try reframing the PD in various ways until it's clearer how the payoffs translate into your values. (E.g., the payoffs should already incorporate values like kindness, fairness, and integrity, if those are things you care about.) If you're in a dilemma where your moral values dominate the decision, then (I Defect, They Cooperate) should be constructed in such a way that it's less fair than (I Cooperate, They Cooperate); or constructed in such a way that it's less kind, or lower-integrity; etc. Don't be misled by the suggestive English-language labels "defect" and "cooperate"!

150

34,916

Rob Bensinger ⏹️ · Apr 7, 2023 · 8:29 PM UTC

Rob Bensinger ⏹️

@robbensinger

7 Apr 2023

Eliezer Yudkowsky's response to "Can someone please explain how people get such highly confident estimates of near-certain doom from AI?":

156

152,285

Rob Bensinger ⏹️ · Nov 26, 2023 · 1:53 AM UTC

Rob Bensinger ⏹️

@robbensinger

26 Nov 2023

Replying to @robbensinger @astupple @slatestarcodex

I realize that there are a lot of statists in the world, so it makes sense to have that hypothesis queued up. But jesus fucking christ, have you ever misunderstood what's happening in this particular weird case.

150

2,928

Rob Bensinger ⏹️ · Nov 26, 2023 · 1:44 AM UTC

Rob Bensinger ⏹️

@robbensinger

26 Nov 2023

Replying to @robbensinger @astupple @slatestarcodex

The freedom-loving transhumanists then spent twenty years working hard to try and better understand this problem, spin up a research field about it, and generally do everything possible except alert the governments of the world about it.

149

8,580

Rob Bensinger ⏹️ · Jan 24, 2025 · 10:30 PM UTC

Rob Bensinger ⏹️

@robbensinger

24 Jan 2025

"We can't let China beat us at Russian roulette!" -@AlexAlarga

159

7,841

Rob Bensinger ⏹️ · Feb 18, 2023 · 10:40 AM UTC

Rob Bensinger ⏹️

@robbensinger

18 Feb 2023

"AGI is scary because argmaxing is scary" 🢡 "AGI is scary because what if an AGI ran a paperclip factory!" "MIRI does math" 🢡 "MIRI does GOFAI" "There's no fire alarm for AGI" 🢡 "Let's start calling every new advance in AI a fire alarm!" WHY DOES MEMETICS WORK THIS WAY

144

34,735

Rob Bensinger ⏹️ · Mar 5, 2025 · 11:57 PM UTC

Rob Bensinger ⏹️

@robbensinger

5 Mar 2025

"Building a new intelligent species that's vastly smarter than humans is a massively dangerous thing to do" is not a niche or weird position, and "we're likely to actually build a thing like that in the next decade" isn't a niche position anymore either.

147

8,328

Rob Bensinger ⏹️ · Jun 27, 2022 · 1:14 AM UTC

Rob Bensinger ⏹️

@robbensinger

27 Jun 2022

My updated guesses at people's views:

141

Rob Bensinger ⏹️ · May 20, 2025 · 4:13 AM UTC

Rob Bensinger ⏹️

@robbensinger

20 May 2025

This is concerning for more than one reason.

AI Notkilleveryoneism Memes ⏸️

@AISafetyMemes

19 May 2025

Wow

139

8,017

Rob Bensinger ⏹️ · Dec 15, 2022 · 1:13 AM UTC

Rob Bensinger ⏹️

@robbensinger

15 Dec 2022

Indeed, often thinking of a question as "just a mundane question I can answer by thinking about it the same as any other question" *is* the key unusual move needed to answer the question.

138

Rob Bensinger ⏹️ · Mar 11, 2023 · 6:58 AM UTC

Rob Bensinger ⏹️

@robbensinger

11 Mar 2023

Replying to @Aella_Girl

@eigenrobot obversed stupidity is not intelligence

135

35,539

Rob Bensinger ⏹️ · Dec 15, 2022 · 1:07 AM UTC

Rob Bensinger ⏹️

@robbensinger

15 Dec 2022

Rather a lot of work is done by just following thoughts through to their conclusion, consistently applying "easy" reasoning methods, etc. Many unusual and important conclusions can be reached without your being sparklingly creative or anything.

137

Rob Bensinger ⏹️ · Dec 14, 2022 · 12:44 AM UTC

Rob Bensinger ⏹️

@robbensinger

14 Dec 2022

Thread for examples of alignment research MIRI has said relatively positive stuff about: ("Relatively" because our overall view of the field is that not much progress has been made, and it's not clear how we can change that going forward. But there's still better vs. worse.)

135

Rob Bensinger ⏹️ · May 9, 2025 · 5:51 AM UTC

Rob Bensinger ⏹️

@robbensinger

9 May 2025

People aren't talking enough about how AI companies are trying to build a new type of AI that experts say has a double-digit chance of literally killing 5% of all children in the US.

137

10,857

Rob Bensinger ⏹️ · Nov 26, 2023 · 7:37 PM UTC

Rob Bensinger ⏹️

@robbensinger

26 Nov 2023

Seems important to mention, given Eliezer Yudkowsky's note about @sama never reaching out to him to talk over the years: Nate Soares and Sam have had a few conversations over the years. Specifically: Nate reached out to Sam in mid-2015 to talk about Sam's plans to create OpenAI; Sam initiated a chat about "AI safety stuff" in late 2021; Nate initiated a second chat on the same topic in early 2022; and Sam initiated a call in mid-2022 (to see if Nate was interested in joining OpenAI).

128

38,862

Rob Bensinger ⏹️ · Jan 4, 2024 · 12:11 AM UTC

Rob Bensinger ⏹️

@robbensinger

4 Jan 2024

Just had a 30m conversation about combinatorics, probability theory, and the gambler's fallacy, using as an example 'What if our hotel room number happens to be 620, the same as my sister's apartment number?'. Walked across the street, checked in, found our number was 620. 😲

126

6,756

Rob Bensinger ⏹️ · Dec 19, 2022 · 10:10 PM UTC

Rob Bensinger ⏹️

@robbensinger

19 Dec 2022

A few small edits:

133

24,427

Rob Bensinger ⏹️ · Nov 26, 2023 · 1:46 AM UTC

Rob Bensinger ⏹️

@robbensinger

26 Nov 2023

Replying to @robbensinger @astupple @slatestarcodex

Because the arguments for 'government intervention will probably just make this problem even worse' were very obvious. Bureaucrats won't understand the problem. They can't ban the dangerous practices alone, because they don't know what's dangerous.

126

5,261

Rob Bensinger ⏹️ · Jul 11, 2018 · 5:42 PM UTC

Rob Bensinger ⏹️

@robbensinger

11 Jul 2018

Elizabeth Morningstar: "rationality is like a martial art in that if you find yourself in a situation where you could use it, you must always try running away first"

128

Rob Bensinger ⏹️ · Apr 25, 2023 · 8:06 PM UTC

Rob Bensinger ⏹️

@robbensinger

25 Apr 2023

I'd guess that the release of the movie "Don't Look Up" is a not-unimportant reason people are responding more sanely to AGI right now. (Though not the only reason.) I'd call "Don't Look Up" a rationality intervention / metacognition intervention / sociology intervention.

122

11,023

Rob Bensinger ⏹️ · Aug 12, 2022 · 11:51 PM UTC

Rob Bensinger ⏹️

@robbensinger

12 Aug 2022

IMO, EAs should be proud of achievements like the TIME cover story and the UN report. It's always hard to predict the net effect of high-profile events like this, but I'd guess they're positive. And these ideas are genuinely worthy ideas, deserving of serious discussion.

125

Rob Bensinger ⏹️ · Apr 18, 2023 · 12:24 AM UTC

Rob Bensinger ⏹️

@robbensinger

18 Apr 2023

Replying to @perrymetzger @ATabarrok

"Hidden Complexity of Wishes" isn't arguing that a superintelligence would lack common sense, or that it would be completely unable to understand natural language. It's arguing that loading the right *motivations* into the AI is a lot harder than loading the right understanding.

125

10,193

Rob Bensinger ⏹️ · May 17, 2025 · 7:38 PM UTC

Rob Bensinger ⏹️

@robbensinger

17 May 2025

Proposed new title: "If Anyone Builds It, Everyone Dies: Why Superintelligent AI Would Kill Us All: No Really We Actually Mean It, This Is Not Hyperbole, No That Is Literally Actually What We're Saying" Hell, that's pretty close to what the book website says:

Eliezer Yudkowsky ⏹️

@ESYudkowsky

17 May 2025

Replying to @xlr8harder @NathanpmYoung

Well, that indeed might be the problem, then. I'm glad you got to hear the book title and find out what my warning actually was.

125

10,996

Rob Bensinger ⏹️ · Jun 2, 2024 · 11:18 PM UTC

Rob Bensinger ⏹️

@robbensinger

2 Jun 2024

Seems absurdly overconfident. E.g., seems to entail a less than 1 in 100,000,000 chance of something else wiping us out (or permanently derailing civilization) first, like an act of bioterrorism?

Dr. Roman Yampolskiy

@romanyam

12 Mar 2024

I am at 99.999999% pauseai.info/pdoom

120

7,181

Rob Bensinger ⏹️ · Jul 27, 2022 · 2:41 AM UTC

Rob Bensinger ⏹️

@robbensinger

27 Jul 2022

"I will sound crazy on Twitter so that my colleagues can sound crazy in private so that their colleagues can build a safer AI so that our children can live in the glorious transhuman future"

118

Rob Bensinger ⏹️ · May 4, 2024 · 4:45 AM UTC

Rob Bensinger ⏹️

@robbensinger

4 May 2024

You don't need to throw out probability theory in order to think 'falsifiability is what separates science from pseudoscience'. But Popper separately does throw out probability theory. And a HUGE part of why he likes 'falsifiability' as a criterion is that it makes science sound more deductive and less probabilistic. ('Falsification' is black and white: either it happens and the theory is Proved False, or it doesn't happen and we have no Proof. Talking about 'supporting and opposing evidence' or 'observations that are more likely under one hypothesis than under another' is infinitely closer to what scientists actually do, and infinitely closer to the real mechanisms behind good epistemic practice; but it's probabilistic and inductive, and for Popper that makes it a non-starter.)

116

19,398

Rob Bensinger ⏹️ · Nov 21, 2022 · 1:38 AM UTC

Rob Bensinger ⏹️

@robbensinger

21 Nov 2022

Replying to @peterhartree @ESYudkowsky @MIRIBerkeley

It's either true (given the information available to us) that humanity's odds of surviving AGI are very low, or it's false -- our ability to come up with mean-sounding labels ("luddite despair"), or to object on vibes grounds (doesn't sound optimistic enough) doesn't change that.

118

Rob Bensinger ⏹️ · Aug 13, 2022 · 2:23 AM UTC

Rob Bensinger ⏹️

@robbensinger

13 Aug 2022

Replying to @ohabryka @RichardMCNgo

My mental model of how EA came about looks something like this. What would you change about this picture? (Same Q to others who see this.) (Can include 'this is suboptimally coarse-grained, the fine-grained version should note things like X and Y'.)

122

Rob Bensinger ⏹️ · Aug 1, 2025 · 4:15 PM UTC

Rob Bensinger ⏹️

@robbensinger

1 Aug 2025