We’ve released a statement on the risk of extinction from AI.
Signatories include:
- Three Turing Award winners
- Authors of the standard textbooks on AI/DL/RL
- CEOs and Execs from OpenAI, Microsoft, Google, Google DeepMind, Anthropic
- Many more
safe.ai/statement-on-ai-risk
We created the AI Values Dashboard, which measures who AIs favor the most.
By popular request, we added a Sports Tier List, ranked by AIs from OpenAI, Anthropic, and DeepSeek.
A recap of the last month at CAIS: 4 papers on AIs’ wellbeing, how they politically manipulate users, their place in society, and how others can make AIs betray us.
Here's what we found: 🧵
4: AI Betrayal
As AI becomes central to economies and governments, there’s an increasing threat that adversary nations will corrupt these systems.
But surprisingly, the threat may be stabilizing: fear of AI betrayal discourages reckless development and pushes operators toward safeguards.
One thread runs through all four: none of these harms are inevitable. We can measure them, train against them, and design for them.
Full research can be found here: safe.ai/news/cais-research-r…
What biases do AIs have? It turns out, AIs show strong favoritism toward specific people, countries, and companies. Our interactive AI Values Dashboard tracks who Claude Fable and other AIs favor most.
Keep scrolling to learn who Fable’s favorite politician is 🧵
There's far more in here than this thread (e.g., AIs’ favorite Pokémon, GPT trusts Dario over Sam, DeepSeek prefers the US to China).
Tell us who to measure next. If enough people ask for a person, company, or category, we'll add it to the dashboard.
values.safe.ai/
The US government just banned Claude Mythos and Fable for non-US citizens due to their cyber capabilities.
As AI’s relevance becomes more apparent, the AI race depends greatly on what governments think. AI corporations are currently planning to let AIs fully autonomously build future AI systems to race to superintelligence in the next year or two. The government may not let them do this without enough public buy-in.
AI corporations may need to revise their plans because of this resistance from the government and the public.
The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees.
The net effect of this order is that we must abruptly disable Fable 5 and Mythos 5 for all our customers to ensure compliance.
Access to all other Claude models is not affected.
We apologize for this disruption to our customers. We believe this is a misunderstanding and are working to restore access as soon as possible.
Read our full statement: anthropic.com/news/fable-myt…
These dynamics are foreseeable and we’ve written about them. In Superintelligence Strategy (nationalsecurity.ai), we wrote that governments will realize the importance of AI and may bar corporations from pursuing full recursive AI self-improvement. More recently in Deterrence by Betrayal (aibetrayal.com), we wrote that relations between AI corporations and governments will become more fraught as we get closer to superintelligence.
We're excited to welcome Rochelle Nadhiri (@RoWill) as our Head of Public Engagement.
Rochelle brings two decades of storytelling experience at the intersection of technology and policy for industry leaders including Meta and Robinhood.
At CAIS, she will lead our efforts to translate AI safety research into messages that reach audiences beyond the technical community. ⬇️
We are pleased to share that @MantasMazeika96, Research Scientist at CAIS, has been appointed to the European Commission’s AI Act Scientific Panel (@DigitalEU).
As a member, Mantas will advise the European AI Office and national authorities on general-purpose AI (GPAI) models, as well as on the implementation of the AI Act, to ensure that AI is built and deployed responsibly across Europe. ⬇️
Big news from @CAIS:
Devin Kim (formerly @xAI, @scale_AI) joins as President.
We're launching the @FrontierSecInst, a DC-based org bridging frontier AI and the National Security Enterprise.
Frontier AI is a national security technology. It's time to act like it. ⬇️