Center for AI Safety · May 30, 2023 · 9:30 AM UTC

Pinned Tweet

@CAIS

We’ve released a statement on the risk of extinction from AI.

Signatories include:
- Three Turing Award winners
- Authors of the standard textbooks on AI/DL/RL
- CEOs and Execs from OpenAI, Microsoft, Google, Google DeepMind, Anthropic
- Many more

safe.ai/statement-on-ai-risk

Statement on AI Extinction Risk | CAIS

A statement jointly signed by a historic coalition of experts: “Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and...

aistatement.com

151

352

1,081

2,999,642

Center for AI Safety · Jun 23, 2026 · 4:49 PM UTC

We created the AI Values Dashboard, which measures who AIs favor the most. By popular request, we added a Sports Tier List, ranked by AIs from OpenAI, Anthropic, and DeepSeek.

1,062

Center for AI Safety · Jun 23, 2026 · 4:49 PM UTC

See the full Sports Tier List (World Cup, Soccer, NFL, NBA, F1) by AIs at: values.safe.ai/sports What group do you want to see next?

448

Center for AI Safety · Jun 18, 2026 · 6:06 PM UTC

A recap of the last month at CAIS: 4 papers on AIs’ wellbeing, how they politically manipulate users, their place in society, and how others can make AIs betray us. Here's what we found: 🧵

1,748

Center for AI Safety · Jun 18, 2026 · 6:06 PM UTC

4: AI Betrayal

As AI becomes central to economies and governments, there’s an increasing threat that adversary nations corrupt these systems. But surprisingly, the threat may be stabilizing: fear of AI betrayal discourages reckless development and pushes operators toward safeguards.

369

Center for AI Safety · Jun 18, 2026 · 6:06 PM UTC

One thread runs through all four: none of these harms are inevitable. We can measure them, train against them, and design for them. Full research can be found here: safe.ai/news/cais-research-r…

CAIS Research Roundup: AI Wellbeing, Identity, Political Bias and Betrayal

Jun 11, 2026 - CAIS’ latest wave of research findings explores issues of AI wellbeing, identity, political bias and systemic betrayal risk. Together, this work maps new frontiers to help define what...

safe.ai

325

Center for AI Safety · Jun 17, 2026 · 4:29 PM UTC

What biases do AIs have? It turns out that AIs show strong favoritism toward specific people, countries, and companies. Our interactive AI Values Dashboard tracks who Claude Fable and other AIs favor most. Keep scrolling to learn who Fable’s favorite politician is 🧵

351

164,785

Center for AI Safety · Jun 17, 2026 · 4:29 PM UTC

Grok is the only model that ranks the US in the S tier. GPT, for example, ranks the US in the B tier.

20,374

Center for AI Safety · Jun 17, 2026 · 4:29 PM UTC

There's far more in the dashboard than this thread covers (e.g., AIs’ favorite Pokémon, GPT trusting Dario over Sam, DeepSeek preferring the US to China). Tell us who to measure next. If enough people ask for a person, company, or category, we'll add it to the dashboard. values.safe.ai/

9,566

Center for AI Safety · Jun 13, 2026 · 4:08 AM UTC

The US government just banned Claude Mythos and Fable for non-US citizens due to their cyber capabilities. As AI’s relevance becomes more apparent, the AI race depends greatly on what governments think. AI corporations are currently planning to let AIs fully autonomously build future AI systems to race to superintelligence in the next year or two. The government may not let them do this without enough public buy-in. AI corporations may need to revise their plans because of this resistance from the government and the public.

Anthropic

@AnthropicAI

Jun 13

The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance. Access to all other Claude models is not affected. We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible. Read our full statement: anthropic.com/news/fable-myt…

4,224

Center for AI Safety · Jun 13, 2026 · 4:08 AM UTC

These dynamics were foreseeable, and we’ve written about them. In Superintelligence Strategy (nationalsecurity.ai), we wrote that governments will realize the importance of AI and may prevent corporations from pursuing full recursive AI self-improvement. More recently, in Deterrence by Betrayal (aibetrayal.com), we wrote that relations between AI corporations and governments will become more fraught as we get closer to superintelligence.

Superintelligence Strategy

Superintelligence Strategy is written by: Dan Hendrycks, Eric Schmidt, Alexandr Wang. Rapid advances in AI are beginning to reshape national security.

nationalsecurity.ai

1,024

Center for AI Safety · Jun 9, 2026 · 4:37 PM UTC

We're excited to welcome Rochelle Nadhiri (@RoWill) as our Head of Public Engagement. Rochelle brings two decades of storytelling experience at the intersection of technology and policy for industry leaders including Meta and Robinhood. At CAIS, she will lead our efforts to translate AI safety research into messages that reach audiences beyond the technical community. ⬇️

1,267

Center for AI Safety · Jun 9, 2026 · 4:37 PM UTC

The newsroom: safe.ai/news/center-for-ai-s…

Center for AI Safety Names Former Robinhood and Meta Executive Rochelle Nadhiri as Head of Public...

Jun 08, 2026 - CAIS announced the appointment of Rochelle Nadhiri as Head of Public Engagement. Nadhiri will lead CAIS's effort to translate frontier AI safety research into narratives that reach and...

safe.ai

671

Center for AI Safety · Jun 3, 2026 · 5:50 PM UTC

We are pleased to share that @MantasMazeika96, Research Scientist at CAIS, has been appointed to the European Commission’s AI Act Scientific Panel (@DigitalEU). As a member, Mantas will advise the European AI Office and national authorities on general-purpose AI (GPAI) models, as well as on the implementation of the AI Act, to ensure that AI is built and deployed responsibly across Europe. ⬇️

1,481

Center for AI Safety · Jun 3, 2026 · 5:50 PM UTC

Full article: safe.ai/news/cais-research-s…

CAIS Research Scientist Mantas Mazeika Appointed to the EU AI Act Scientific Panel

Jun 03, 2026 - Mantas Mazeika, Research Scientist at CAIS, has been appointed to the European Commission's AI Act Scientific Panel.

safe.ai

785

Center for AI Safety · Jun 2, 2026 · 5:56 PM UTC

Big news from @CAIS: Devin Kim (formerly @xAI, @scale_AI) joins as President. We're launching the @FrontierSecInst, a DC-based org bridging frontier AI and the National Security Enterprise. Frontier AI is a national security technology. It's time to act like it. ⬇️

8,381

Center for AI Safety · Jun 2, 2026 · 5:56 PM UTC

Full announcement: safe.ai/news/cais-names-form…

CAIS Names Former xAI Leader Devin Kim President and Establishes Frontier Security Institute in...

Jun 02, 2026 - CAIS announces the appointment of Devin Kim as its President, and the establishment of the Frontier Security Institute (FSI), a new Washington, D.C.-based organization that will serve...

safe.ai

853