The Jensen Huang episode. 0:00:00 – Is Nvidia’s biggest moat its grip on scarce supply chains? 0:16:25 – Will TPUs break Nvidia’s hold on AI compute? 0:41:06 – Why doesn’t Nvidia become a hyperscaler? 0:57:36 – Should we be selling AI chips to China? 1:35:06 – Why doesn’t Nvidia make multiple different chip architectures? Look up Dwarkesh Podcast on YouTube, Apple Podcasts, Spotify, etc. Enjoy!
534
1,096
8,537
6,187,794
Does everyone secretly feel that if you went back to college now, you could learn so much faster, better, and deeper? And that your 18 year old self kinda wasted the opportunity?
1,667
1,174
26,644
2,253,579
The @karpathy interview 0:00:00 – AGI is still a decade away 0:30:33 – LLM cognitive deficits 0:40:53 – RL is terrible 0:50:26 – How do humans learn? 1:07:13 – AGI will blend into 2% GDP growth 1:18:24 – ASI 1:33:38 – Evolution of intelligence & culture 1:43:43 - Why self driving took so long 1:57:08 - Future of education Look up Dwarkesh Podcast on YouTube, Apple Podcasts, Spotify, etc. Enjoy!
535
2,903
18,572
10,750,315
It’s funny how China has basically the inverse problem as America. We subsidize demand and restrict supply. They subsidize supply and restrict demand. We can’t rebuild fallen bridges. They build bridges to nowhere. In the most desirable cities in this country, every random Victorian house and park bench is a historic site that can’t be disturbed. They’ll bulldoze a 500 year old temple to build a skyscraper complex that no one wants to live in.
148
765
12,630
925,892
Tony Blair explains the 3 key decisions Lee Kuan Yew took that made Singapore rich. “Each one of them now seems obvious. Each one at the time was deeply contested.”
292
2,905
11,581
2,140,121
Tomorrow
360
220
9,691
457,596
I still haven't heard a good answer to this question, on or off the podcast. AI researchers often tell me, "Don't worry bout it, scale solves this." But what is the rebuttal to someone who argues that this indicates a fundamental limitation?
699
674
7,756
3,029,070
My family moved to the US when I was 8, but by the time I turned 20, my dad was still on an H1B (waiting to get processed for a green card). Once I turned 21, I would age out as his dependent, despite the fact that I basically grew up in the US. I thought I'd have to become a code monkey after college, and even that only if I was lucky enough to win the H1B lottery. Otherwise, back to India. I had become a huge fan of @paulg's essays in college. I was actually depressed that my desire to start a startup or do something entrepreneurial was basically hopeless. Working on the promising podcast I was doing as a side project? A beyond impossible pipe dream. Even after 9 years, my dad wasn't able to get a green card - and the lines were only getting longer over time. I figured I'd be an old man before I could quit some FANG job and build my own thing. By some miracle, COVID travel restrictions cleared out the lines, and I got my green card literally months before I would have aged out. If not for this unbelievable coincidence, I would not be hosting the podcast. In the best case, I would be shifting pixels around in the 3rd sub-sub-menu of some big tech software. I'm incredibly grateful I made it through. But it's unconscionable that we put the kids of high skilled immigrants through all this anxiety, and in many cases make them repeat the nerve-racking indentured life trajectory that they had to watch their parents go through.
570
1,099
9,970
2,258,111
Kids, don't underestimate the power of a cold email. (From @Microsoft's YouTube channel - link below)
103
357
6,554
533,501
.@satyanadella on: - why he doesn’t believe in AGI but does believe in 10% economic growth - Microsoft’s new topological qubit breakthrough and gaming world models - whether Office commoditizes LLMs or the other way around Links below. Enjoy! Timestamps 0:00:00 - Intro 0:05:48 - AI won't be winner-take-all 0:16:02 - World economy growing by 10% 0:22:23 - Decreasing price of intelligence 0:31:03 - Microsoft's Quantum breakthrough 0:43:35 - Microsoft's gaming world model 0:50:35 - Legal barriers to AI 0:56:30 - Getting AGI safety right 1:05:43 - 34 years at Microsoft 1:11:31 - Does Satya Nadella believe in AGI?
174
778
6,129
2,263,930
I find it frustrating that almost every nonfiction book is basically just a history lesson, even if it's nominally about some science/tech/policy topic. Nobody will just explain how something works. Books about the semiconductor industry will never actually explain the basic process flow inside a fab, but you can bet that there will be a minute-by-minute recounting of a dramatic 1980s Intel boardroom battle.
417
190
5,619
602,061
.@satyanadella shows me the first (and currently only) Majorana 1 quantum computing chip
A couple reflections on the quantum computing breakthrough we just announced... Most of us grew up learning there are three main types of matter that matter: solid, liquid, and gas. Today, that changed. After a nearly 20 year pursuit, we’ve created an entirely new state of matter, unlocked by a new class of materials, topoconductors, that enable a fundamental leap in computing. It powers Majorana 1, the first quantum processing unit built on a topological core. We believe this breakthrough will allow us to create a truly meaningful quantum computer not in decades, as some have predicted, but in years. The qubits created with topoconductors are faster, more reliable, and smaller. They are 1/100th of a millimeter, meaning we now have a clear path to a million-qubit processor. Imagine a chip that can fit in the palm of your hand yet is capable of solving problems that even all the computers on Earth today combined could not! Sometimes researchers have to work on things for decades to make progress possible. It takes patience and persistence to have big impact in the world. And I am glad we get the opportunity to do just that at Microsoft. This is our focus: When productivity rises, economies grow faster, benefiting every sector and every corner of the globe. It’s not about hyping tech; it’s about building technology that truly serves the world.
Community note
Microsoft's supporting paper, published in Nature, does not support the claim that they have created a topological qubit. nature.com/articles/s4158… Peer reviewers of the Nature paper expressed concern that the paper misleadingly implies that a topological qubit was demonstrated or otherwise achieved: static-content.springer.com/esm/art%3A10.1
66
574
5,376
520,736
Tomorrow
165
106
5,533
661,011
The most interesting part for me is where @karpathy describes why LLMs aren't able to learn like humans. As you would expect, he comes up with a wonderfully evocative phrase to describe RL: “sucking supervision bits through a straw.” A single end reward gets broadcast across every token in a successful trajectory, upweighting even wrong or irrelevant turns that lead to the right answer. > “Humans don't use reinforcement learning, as I've said before. I think they do something different. Reinforcement learning is a lot worse than the average person thinks. Reinforcement learning is terrible. It just so happens that everything that we had before is much worse.” So what do humans do instead? > “The book I’m reading is a set of prompts for me to do synthetic data generation. It's by manipulating that information that you actually gain that knowledge. We have no equivalent of that with LLMs; they don't really do that.” > “I'd love to see during pretraining some kind of a stage where the model thinks through the material and tries to reconcile it with what it already knows. There's no equivalent of any of this. This is all research.” Why can’t we just add this training to LLMs today? > “There are very subtle, hard to understand reasons why it's not trivial. If I just give synthetic generation of the model thinking about a book, you look at it and you're like, 'This looks great. Why can't I train on it?' You could try, but the model will actually get much worse if you continue trying.” > “Say we have a chapter of a book and I ask an LLM to think about it. It will give you something that looks very reasonable. But if I ask it 10 times, you'll notice that all of them are the same.” > “You're not getting the richness and the diversity and the entropy from these models as you would get from humans. How do you get synthetic data generation to work despite the collapse and while maintaining the entropy? It is a research problem.” How do humans get around model collapse? 
> “These analogies are surprisingly good. Humans collapse during the course of their lives. Children haven't overfit yet. They will say stuff that will shock you. Because they're not yet collapsed. But we [adults] are collapsed. We end up revisiting the same thoughts, we end up saying more and more of the same stuff, the learning rates go down, the collapse continues to get worse, and then everything deteriorates.” In fact, there’s an interesting paper arguing that dreaming evolved to assist generalization, and resist overfitting to daily learning - look up The Overfitted Brain by @erikphoel. I asked Karpathy: Isn’t it interesting that humans learn best at a part of their lives (childhood) whose actual details they completely forget, adults still learn really well but have terrible memory about the particulars of the things they read or watch, and LLMs can memorize arbitrary details about text that no human could but are currently pretty bad at generalization? > “[Fallible human memory] is a feature, not a bug, because it forces you to only learn the generalizable components. LLMs are distracted by all the memory that they have of the pre-trained documents. That's why when I talk about the cognitive core, I actually want to remove the memory. I'd love to have them have less memory so that they have to look things up and they only maintain the algorithms for thought, and the idea of an experiment, and all this cognitive glue for acting.”
The @karpathy interview 0:00:00 – AGI is still a decade away 0:30:33 – LLM cognitive deficits 0:40:53 – RL is terrible 0:50:26 – How do humans learn? 1:07:13 – AGI will blend into 2% GDP growth 1:18:24 – ASI 1:33:38 – Evolution of intelligence & culture 1:43:43 - Why self driving took so long 1:57:08 - Future of education Look up Dwarkesh Podcast on YouTube, Apple Podcasts, Spotify, etc. Enjoy!
228
715
5,197
1,050,496
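The credit-assignment point in the Karpathy post above (a single end reward broadcast across every token in a successful trajectory) can be made concrete with a toy sketch. This is an illustration only, not any lab's training code: the trajectory steps, the `broadcast_outcome_reward` helper, and the reward value are all hypothetical.

```python
def broadcast_outcome_reward(num_steps, final_reward):
    """Outcome-only credit assignment: every step gets the same final reward."""
    return [final_reward] * num_steps


# Hypothetical trajectory: the steps a model took on its way to an answer.
trajectory = ["set up equation", "wrong detour", "backtrack", "correct answer"]

# The only supervision signal is whether the final answer was right.
final_reward = 1.0

per_step_credit = broadcast_outcome_reward(len(trajectory), final_reward)

for step, credit in zip(trajectory, per_step_credit):
    # The "wrong detour" is upweighted exactly as much as the useful steps.
    print(f"{step:>16}: credit {credit}")
```

Because the only signal is the outcome, the detour step is reinforced just as strongly as the steps that actually mattered, which is the inefficiency the "straw" phrase points at.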
I regret to inform you that blocking Twitter and YouTube on your phone + work computer indeed improves your productivity, sleep, and how you spend your free time.
121
245
5,192
253,634
Curtis Yarvin is mistaken when he says that Apple can produce iPhones because it's a monarchy. There are millions of firms ("monarchies") in the world that can't produce anything nearly as impressive as iPhones, from the laundromat down the street to Boeing. Apple is the result of a decades-long evolutionary process facilitated by the market which uplifts the very best culture, talent, processes, and ideas in the entire world. And the moment Apple slips, it'll get replaced (the average lifespan of a Fortune 500 company is 15 years). Governments just don't work this way. Xi Jinping isn't competing against a million counterfactual Chinese leaders who didn't do Zero-COVID, avoided deflation, didn't kill the tech industry, and were awake to AGI. He can fuck up as much as he wants. If a monarch happens to be competent, like Lee Kuan Yew, it's merely by chance, not due to some intrinsic selection mechanism of monarchy that we can replicate. You are just as likely to get brutal dictators like Mao and Stalin by chance - this is not a reasonable gamble to take with the lives of hundreds of millions of citizens. Apple is indeed a wonderfully competent organization - if we want more of the world to be run competently, we should delegate more functions to the market, which is constantly and ruthlessly sizing down incompetence. To be clear, a ton of incompetent businesses exist, but they lose access to capital, talent, and power rapidly, which is reallocated to those who can deliver. They don't drag down the fortunes of entire countries and kill millions of people, which has happened again and again in authoritarian systems.
210
406
4,830
929,151
“Had Mao died in 1956, his achievements would have been immortal. Had he died in 1966, he would still have been a great man but flawed. But he died in 1976. Alas, what can one say?” - Chen Yun, a leading Chinese Communist Party official under Mao and Deng Xiaoping
It's taken me a little while to collect my thoughts on DOGE. Here are 50 of them. I'd appreciate your feedback. statecraft.pub/p/50-thoughts…
44
197
4,623
438,391
.@RichardSSutton, father of reinforcement learning, doesn’t think LLMs are bitter-lesson-pilled. My steel man of Richard’s position: we need some new architecture to enable continual (on-the-job) learning. And if we have continual learning, we don't need a special training phase - the agent just learns on-the-fly - like all humans, and indeed, like all animals. This new paradigm will render our current approach with LLMs obsolete. I did my best to represent the view that LLMs will function as the foundation on which this experiential learning can happen. Some sparks flew. 0:00:00 – Are LLMs a dead-end? 0:13:51 – Do humans do imitation learning? 0:23:57 – The Era of Experience 0:34:25 – Current architectures generalize poorly out of distribution 0:42:17 – Surprises in the AI field 0:47:28 – Will The Bitter Lesson still apply after AGI? 0:54:35 – Succession to AI
247
620
4,456
3,079,334
190
61
4,111
134,154
tfw @gwern casually articulates a more insightful framework for thinking about your life's work than you've ever sketched out yourself.
98
147
4,175
419,948
Has someone come up with a great prompt for socratic tutoring? Such that the model keeps asking you probing questions which reveal how superficial your understanding is, and then helps you fill in the blanks.
159
198
4,090
425,638
.@satyanadella gave me and @dylan522p an exclusive tour of Fairwater 2, the most powerful AI datacenter in the world. We then chatted through Satya's vision for Microsoft in a world with AGI. 0:00:00 - Fairwater 2 0:04:15 - Business models for AGI 0:13:42 - Copilot 0:20:56 - Whose margins will expand most? 0:37:12 - MAI 0:48:42 - The hyperscale business 1:03:39 - In-house chip & OpenAI partnership 1:10:30 - The CAPEX explosion 1:16:01 - Will the world trust US companies to lead AI? Look up Dwarkesh Podcast on YouTube, Apple Podcasts or Spotify to tune in.
235
620
3,755
1,894,636
This question is even more puzzling and salient given the existence of Deep Research
I still haven't heard a good answer to this question, on or off the podcast. AI researchers often tell me, "Don't worry bout it, scale solves this." But what is the rebuttal to someone who argues that this indicates a fundamental limitation?
332
238
3,627
1,230,313
Liang Wenfeng, CEO of DeepSeek, has an open invite to my podcast. If you're in a position to relay this message to him, I appreciate it!
63
119
3,504
214,748
Zuck on: - Llama 3 - open sourcing towards AGI - custom silicon, synthetic data, & energy constraints on scaling - Caesar Augustus, intelligence explosion, bioweapons, $10b models, & much more Enjoy! Links below
117
342
3,184
870,873
I'm so pleased to present a new book with @stripepress: "The Scaling Era: An Oral History of AI, 2019-2025." Over the last few years, I interviewed the key people thinking about AI: scientists, CEOs, economists, philosophers. This book curates and organizes the highlights across all these conversations. You get to see thinkers across many, many fields address the same gnarly questions: “What is the true nature of intelligence? What will change from the millions of machine intelligences running around? What exactly will it take to get there?” Settled answers are unavailable; we’re all running unsupervised. But between these discussions lie, I hope, some insights on the most interesting and important questions of our era. Link below. Enjoy!
152
267
3,285
540,010
Unreasonably effective writing advice: "What are you trying to say here? Okay, just write that."
48
193
3,242
653,714
New post Everyone is sleeping on the *collective* advantages AIs will have, which have nothing to do with raw IQ - they can be copied, distilled, merged, scaled, and evolved in ways humans simply can't. I talk about how a firm of AGIs might work. 1. Copy
132
326
3,111
885,918
Japan was richer per capita than the US in the late 1980s. Today it sits at the bottom among developed countries. How does an economic superpower fall this far and never recover? Kenneth Rogoff (former Chief Economist of IMF) walked me through what he believes was a catastrophic mistake. In 1985, the US pressured Japan to rapidly strengthen the yen and liberalize its financial markets through the Plaza Accord. The yen doubled in value in just 3 years. To offset the economic shock, Japan slashed interest rates and flooded the economy with cheap credit. Japanese banks, suddenly freed from decades of tight regulation, went on a lending spree. They poured money into real estate and stocks with little risk assessment. Japan's stock market became worth more than the US stock market despite having half the population. The total value of Japanese real estate was 4 times the value of all US real estate. When the bubble burst in 1991, banks were left with massive bad loans. The entire financial system seized up, creating a "lost decade" of deflation and stagnation. Here's what stunned me: Rogoff estimates Japan would be 50% wealthier per person today without this crisis. I didn't grasp before this interview how devastating financial crises are. They don't just cause a temporary recession - they permanently alter a country's growth trajectory. Three decades later, Japan still hasn't recovered. Full interview with @krogoff out tomorrow.
159
419
3,060
426,661
.@CJHandmer on how to feed the AIs. 0:00:00 – Why doesn’t China win by default 0:08:28 – Why hyperscalers choose natural gas over solar 0:18:01 – Solar's astonishing learning rates 0:27:02 – How to build 50,000 acre solar-powered data centers 0:40:24 – Environmental regulations blocking clean energy 0:44:04 – Batteries replacing the grid 0:49:14 – GDP is broken, AGI's true value must be measured in total energy use 0:58:45 – Silicon wafers in space with one mind each Available on Apple Podcasts, Spotify, YouTube, etc. Enjoy!
215
520
2,854
1,408,057
Do we have a satisfying answer for why we don’t have access to memories from infancy?
46
181
2,992
219,981
144
24
2,885
97,137
.@leopoldasch on: - the trillion dollar cluster - unhobblings + scaling = 2027 AGI - CCP espionage at AI labs - leaving OpenAI and starting an AGI investment firm - dangers of outsourcing clusters to the Middle East - The Project Full episode (including the last 32 minutes cut by Twitter) available at links below. Enjoy!
99
302
2,713
3,252,790
China spent 25 years failing to build a globally competitive domestic car industry through hundreds of billions of dollars of subsidies and forced partnerships. Then they tried something radical. In 2018, they invited Tesla to build a wholly-owned factory in Shanghai. No joint venture required. Chinese officials called it the "catfish effect" - Tesla would force domestic companies to compete or die. The impact was brutal. When Tesla's Model 3 launched in 2020, it quickly became China's best-selling EV. BYD's total vehicle sales actually fell 7.7% that year to just 427,000 units. By competing with the world's best, BYD was forced to address what they were missing: Chinese EVs had great battery tech and software, but the cars weren't appealing. So BYD learned design. Sales exploded. Today BYD sells over 4 million vehicles annually - 10x their 2020 numbers, and more than Tesla globally. America should do the same thing. In the long run, the only way to win high-tech manufacturing is to actually compete against the very best in the world. We should encourage BYD and other leading Chinese companies to build factories in America in exchange for access to the U.S. market. This would allow us to build up process knowledge and the relevant supply chains domestically, and force American companies to catch up. The alternative is to fall further and further behind. I talked to Arthur Kroeber @arkroeber, one of the sharpest observers of China’s economy, about this and much more. Full episode out tomorrow.
146
374
2,692
424,106
I asked Victor Shih this question - why has the Chinese stock market been flat for so long despite the economy growing so fast? This puzzle is explained via China's system of financial repression. If you save money in China, banks are not giving you the true competitive interest rate. Rather, they'll give you the government capped 1.3% (lower than inflation, meaning you're earning a negative return). The net interest (which is basically a tax on all Chinese savers) is shoveled into politically favored state owned enterprises that survive only on subsidized credit. But here's what I didn't understand at first: Why don't companies just raise equity capital and operate profitably for shareholders? The answer apparently is that there's no 'outside' the system. The state doesn't just control credit - it controls land, permits, market access, even board seats through Party committees. Companies that prioritize profits over market share lose these privileges. Those that play along get subsidized loans, regulatory favors, and government contracts. Regular savers, founders, and investors are all turned into unwitting servants of China's industrial policy
What's the best explanation for why, despite China's economy booming, the Shanghai Stock Exchange has been flat for decades?
91
316
2,617
505,723
On a viewer-minute adjusted basis, I host the Sarah Paine podcast, where I sometimes also talk about AI. I'm very excited to announce a new 6-part in-person lecture series with her in San Francisco in July. We're giving the tickets away for free. Links below.
82
61
2,658
809,735
In 2 weeks, I’m going to be releasing an episode with 2 guests who have 2045+ timelines, are skeptical of the whole “alignment” framework, and don’t think an intelligence explosion is plausible.
89
53
2,604
249,847
The @gwern interview. 0:00:00 – Anonymity 0:01:09 – Automating Steve Jobs 0:04:38 – Isaac Newton's theory of progress 0:06:36 – Grand theory of intelligence 0:10:39 – Seeing scaling early 0:21:04 – AGI Timelines 0:22:54 – What to do in remaining 3 years until AGI 0:26:29 – Influencing the shoggoth with writing 0:30:50 – Human vs artificial intelligence 0:33:52 – Rabbit holes 0:38:48 – Hearing impairment 0:43:00 – Wikipedia editing 0:47:43 – Gwern dot net 0:50:20 – Counterfactual careers 0:54:30 – Borges & literature 1:01:32 – Gwern's process 1:19:17 - Gwern's finances 1:25:05 - Random
100
287
2,636
586,400
Dominic Cummings: "Imagine if Steve Jobs or Tim Cook or Patrick Collison actually spent a large part of their day just doing photo ops. You have a person whose time is the single most precious asset. Yet, they're actually standing with the ambassador from Tonga Zonga, just doing photo ops for a large part of the day or going to stupid ceremonies. These people have all grown up in a system where they just don't know any better than dealing with the media all day." @Dominic2306, former Chief Advisor to PM, on the complete misprioritization of ministers' time and efforts. Full episode out Wednesday.
81
256
2,374
749,654
I hassled @tylercowen about why he doesn't expect explosive economic growth from AGI. How could we possibly add 100 billion extra workers and only get 0.5% more growth? Also featuring Stalin's library, EU decels, and how Churchill was an underachiever. Hilarious and provocative throughout. My 4th interview with Tyler. He surprises me every time. Enjoy! Timestamps 0:00:00 - Economic Growth and AI 0:15:45 - Founder Mode and increasing variance 0:30:19 - Effective Altruism and Progress Studies 0:33:53 - What AI changes for Tyler 0:45:45 - The slow diffusion of innovation 0:50:41 - Stalin's library 0:53:07 - DC vs SF vs EU
91
217
2,487
1,323,873
I have been obsessed with what the geneticist David Reich told me in our interview together. The story of human evolution we're now learning from new evidence is so crazy. 70,000 years ago, half a dozen different species of humans (Neanderthals, Denisovans, 'Hobbits', etc) lived across Eurasia. And then some small group of modern humans (only 1,000 to 10,000 people) drove all of them to extinction. Everyone native to Eurasia and America is descended from this one tribe. Here's the crazy part - modern humans with language and big brains have been around for hundreds of thousands of years. And we had ventured out of Africa before. But we were always beaten back by these other humans. What did this small group of humans 70,000 years ago figure out such that they completely dominated the planet? Full episode out Thursday.
112
192
2,385
419,415
I had no idea how wild the story of human evolution was before chatting with the geneticist of ancient DNA David Reich. Human history has been again and again a story of one group figuring ‘something’ out, and then basically wiping everyone else out. From the tribe of 1k-10k modern humans 70,000 years ago who killed off all the other human species; to the Yamnaya horse nomads 5,000 years ago who killed off 90+% of (then) Europeans and also destroyed the Indus Valley Civilization. So much of what we thought we knew about human history is turning out to be wrong, from the ‘Out of Africa’ theory to the evolution of language, and this is all thanks to the research from David Reich’s lab. Extremely fascinating stuff. Enjoy! Links below.
74
358
2,575
600,935
LLMs are 5/10 writers. So the fact that you can reliably improve on the explanations in papers and books by asking an LLM to summarize them is a huge condemnation of academic writing.
140
126
2,322
279,736
I studied computer science in college, but I wised up after I graduated, and transitioned to the more risk-averse conservative career of podcasting.
Software developer job postings over the last five years Hard to find a crazier chart
35
44
2,283
123,655
Liang Mong Song was TSMC's renegade genius. He gets frustrated with the leadership, so he defects to South Korea and single handedly pushes Samsung to overtake TSMC. Then SMIC, China's main foundry, poaches him. He becomes CEO and brings along a conga line of Samsung and TSMC's top talent. He is responsible for pushing China to 7nm (close to the leading edge). "That guy's a genius. He's like 78 and he's beyond brilliant. Does not care about people." Full episode with @asianometry and @dylan522p out Wednesday. 3 hour bonanza on the semiconductor industry.
47
217
2,211
349,311
Read CH 1, 2, and 3 of this book. Go into debt if you have to.
I sometimes help my friends rewrite their announcements/launches/blog posts for Twitter. Sharing what ends up commonly helpful. 90% of my value ends up being just getting them to say what they're trying to say. Literally the first thing I do is discard their current draft, turn on Whisper narration, and just ask them to explain their idea to me like I was hearing about it for the first time. Every single time, it's immediately so much better than what they'd written before. What commonly changes: The first thing they say is closer to the material in the original draft's paragraph 4. Something about writing down an essay or thread makes people feel the need to clear their throat for 3 paragraphs. If you were explaining your company or blog post idea to me at lunch, you wouldn't open with, "For years, our research community has been..." You'd say, "We're making x for y. The way we do this is..." Other advice: 1. Be as concrete as possible. People don't have the tangible understanding of your company or result that you do. So you need to first just explain what you're actually doing before you start talking about the underlying motivations and the big implications. I was complaining earlier about ML announcements. Apply this to your domain: "Every time someone proposes training a model for some domain (virtual cells/materials/whatever), the first sentence should be: input data: x, output data: y, loss function: z. Otherwise I have no idea what's actually being proposed." 2. Cut fluff ruthlessly. People are going to read 10% of whatever you want to write. Have it be the 10% you chose, rather than the 10% they randomly stumbled upon.
28
104
2,178
250,096
Wikipedia is fucking incredible. Concise, so well organized and easy to read, frictionless rabbit holing, filled with concrete details (unlike many books, which are largely vacuous monologues). People often ask me for book recs. Most books suck. Read Wikipedia instead lol.
136
95
2,107
112,939
On a Saturday!
insane that he responded in 4 mins
41
43
2,057
174,321
.@dylan522p called it in Oct 2024.
Announcing The Stargate Project The Stargate Project is a new company which intends to invest $500 billion over the next four years building new AI infrastructure for OpenAI in the United States. We will begin deploying $100 billion immediately. This infrastructure will secure American leadership in AI, create hundreds of thousands of American jobs, and generate massive economic benefit for the entire world. This project will not only support the re-industrialization of the United States but also provide a strategic capability to protect the national security of America and its allies. The initial equity funders in Stargate are SoftBank, OpenAI, Oracle, and MGX. SoftBank and OpenAI are the lead partners for Stargate, with SoftBank having financial responsibility and OpenAI having operational responsibility. Masayoshi Son will be the chairman. Arm, Microsoft, NVIDIA, Oracle, and OpenAI are the key initial technology partners. The buildout is currently underway, starting in Texas, and we are evaluating potential sites across the country for more campuses as we finalize definitive agreements. As part of Stargate, Oracle, NVIDIA, and OpenAI will closely collaborate to build and operate this computing system. This builds on a deep collaboration between OpenAI and NVIDIA going back to 2016 and a newer partnership between OpenAI and Oracle. This also builds on the existing OpenAI partnership with Microsoft. OpenAI will continue to increase its consumption of Azure as OpenAI continues its work with Microsoft with this additional compute to train leading models and deliver great products and services. All of us look forward to continuing to build and develop AI—and in particular AGI—for the benefit of all of humanity. We believe that this new step is critical on the path, and will enable creative people to figure out how to use AI to elevate humanity.
47
118
2,009
252,049
This was really interesting. The goal was to try and predict prices a minute out from an order book of previous bids and asks. Of course, my very simple linear model obviously would not successfully predict the trajectory of mid price in a real market. But the exercise helped me get some intuition for the kind of feature engineering that's necessary to start making sense of the terabytes of market data.
We invited @dwarkesh_sp to tackle a foundational question in quant trading: What does it take to build a predictive signal from market data? We loved showing him what makes work at HRT so fun — and why, in Marc’s words, “it occupies a lot of very smart people for years.”
18
83
2,015
340,831
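The exercise described in the post above (a very simple linear model predicting the mid price from an order book) might look roughly like the sketch below. Everything here is an assumption for illustration: the snapshot values are made up, the snapshots are taken to be one minute apart, and order-book imbalance is just one plausible hand-built feature, not necessarily the one used in the actual exercise.

```python
def mid_price(best_bid, best_ask):
    """Midpoint between the best bid and the best ask."""
    return (best_bid + best_ask) / 2.0


def imbalance(bid_size, ask_size):
    """Order-book imbalance in [-1, 1]; positive means more resting buy interest."""
    return (bid_size - ask_size) / (bid_size + ask_size)


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


# Made-up snapshots: (best_bid, bid_size, best_ask, ask_size), assumed 1 minute apart.
snapshots = [
    (100.00, 500, 100.02, 300),
    (100.01, 200, 100.03, 600),
    (100.00, 700, 100.02, 100),
    (100.02, 300, 100.04, 300),
    (100.02, 150, 100.04, 450),
    (100.01, 400, 100.03, 400),
]

mids = [mid_price(bid, ask) for bid, _, ask, _ in snapshots]

# Feature at time t: imbalance. Target: change in mid price from t to t+1.
xs = [imbalance(bid_size, ask_size) for _, bid_size, _, ask_size in snapshots[:-1]]
ys = [mids[i + 1] - mids[i] for i in range(len(mids) - 1)]

slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.4f}, intercept={intercept:.4f}")
```

On this toy data the fitted slope is positive, meaning heavier bid-side size lines up with the mid price ticking up. As the post says, a model this simple would not hold up in a real market; the point is only to show what turning raw book data into a feature and a target looks like.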
Join us to hear from @collision and @patrickc about trends that will shape the next decade of payments and global trade—with guest @dwarkesh_sp. Streaming live Tuesday, May 6 at 4 pm PT. nitter.app/i/broadcasts/1zqKVjWwn…
30
6
1,941
99,370
On the odds of a Taiwan invasion & how the CCP thinks w. Naval War College historian Sally Paine Full episode out tomorrow: "the West learned that you read improbable speeches ... let's judge Xi Jinping at his word, & he says he's going to go for it We're at an inflection point - a lot of educated people and businesses want to make autonomous decisions ... and the Communist Party said that's off the table."
68
207
1,823
821,929
I read a lot of books for my podcast & blog. But often I can't find the particular passage I'm looking for. Ctrl-F doesn't work unless you know the exact phrase. So I built search for ebooks using OpenAI's embeddings API. Link below to use. Works surprisingly well!
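A rough sketch of how embedding-based passage search can work, under loud assumptions: `toy_embed` is a hashed bag-of-words stand-in so the example runs offline, and it is not the OpenAI embeddings API or the tool's actual code. A real embedding model would replace it and would also match paraphrases, which this stand-in cannot.

```python
import hashlib

import numpy as np

DIM = 256  # size of the toy embedding space (arbitrary choice)


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: hashed word counts, L2-normalized."""
    vec = np.zeros(DIM)
    for raw in text.lower().split():
        word = raw.strip(".,;:!?\"'()")
        if not word:
            continue
        idx = int(hashlib.md5(word.encode()).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def build_index(passages):
    """Embed every passage once; rows are unit vectors."""
    return np.stack([toy_embed(p) for p in passages])


def search(query, passages, index, top_k=3):
    """Rank passages by cosine similarity to the query embedding."""
    scores = index @ toy_embed(query)
    order = np.argsort(-scores)[:top_k]
    return [(passages[i], float(scores[i])) for i in order]
```

The design point is that passages are embedded once up front, so each query costs one embedding call plus a matrix-vector product.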
48
189
1,845
Made myself a research tool Claude makes flash cards from passages I highlight Reduces the friction of using spaced repetition. Sharing in case helpful to you! generateflash dot cards
59
78
1,890
228,149
Not sure I agree that hosting a podcast qualifies me as one of the 100 most influential people in AI haha. But I'm honored!
TIME's new cover: The 100 most influential people in AI ti.me/4dQcJ1Q
143
21
1,817
244,263
Notes on visiting China. On scale, construction, youngsters, public opinion, censorship, tech, and AI. Link below.
75
83
1,835
417,664
The @JeffDean & @NoamShazeer episode. We talk about 25 years at Google, from PageRank to MapReduce to the Transformer to MoEs to AlphaChip – and soon to ASI. My favorite part was Jeff's vision for AGI as one giant MoE that is grown in bits and pieces over time like a forest, rather than trained all at once. Specialization, distillation, inference time scaling all emerge organically rather than by design. Noam bites every bullet: 100x world GDP soon; let’s get a million automated researchers running in the Google datacenter; living to see the year 3000. Links below. Enjoy! Timestamps 0:00:00 - Intro 0:03:29 - Joining Google in 1999 0:06:20 - Future of Moore's Law 0:11:04 - Future TPUs 0:13:56 - Jeff’s undergrad thesis: parallel backprop 0:15:54 - LLMs in 2007 0:25:09 - “Holy shit” moments 0:27:28 - AI fulfills Google’s original mission 0:32:00 - Doing Search in-context 0:36:12 - The internal coding model 0:37:29 - What will 2027 models do? 0:43:20 - A new architecture every day? 0:49:10 - Automated chips and intelligence explosion 0:53:07 - Future of inference scaling 1:02:38 - Already doing multi-datacenter runs 1:08:15 - Debugging at scale 1:12:41 - Fast takeoff and superalignment 1:20:51 - A million evil Jeff Deans 1:24:22 - Fun times at Google 1:27:51 - World compute demand in 2030 1:34:37 - Getting back to modularity 1:44:48 - Keeping a giga-MoE in-memory 1:49:35 - All of Google in one model 1:57:59 - What’s missing from distillation 2:03:10 - Open research, pros and cons 2:09:58 - Going the distance
68
235
1,863
544,888
Boy do you guys have a lot of thoughts about the @RichardSSutton interview. I’ve been thinking about it myself. I have a better understanding of Sutton’s perspective now than I did during the interview itself. So I want to reflect on it a bit. Richard, apologies for any errors or misunderstandings. It’s been very productive to learn from your thoughts. The steelman What is the bitter lesson about? It is not saying that you just want to throw as much compute away as possible. The bitter lesson says that you want to come up with techniques which most effectively and scalably leverage compute. Most of the compute spent on an LLM is used on running it in deployment. And yet it’s not learning anything during this time! It’s only learning during this special phase we call training. That is not an effective use of compute. And even the training period by itself is highly inefficient - GPT-5 was trained on the equivalent of 10s of 1000s of years of human experience. What’s more, during this training phase, all their learning comes straight from human data. This is an obvious point in the case of pretraining data. But it’s even kind of true for the RLVR we do on LLMs: these RL environments are human furnished playgrounds to teach LLMs the specific skills we have prescribed for them. The agent is in no substantial way learning from organic and self-directed engagement with the world. Having to learn only from human data (an inelastic hard-to-scale resource) is not a scalable use of compute. What these LLMs learn from training is not a true world model (which tells you how the environment changes in response to different actions) Rather, they are building a model of what a human would say next. And this leads them to rely on human-derived concepts. If you trained an LLM on the data from 1900, it wouldn't be able to come up with relativity from scratch. 
Though now that it has a training corpus which explains relativity, it can use that concept to help you with your physics homework. LLMs aren’t capable of learning on-the-job, so we’ll need some new architecture to enable continual learning. And once we have it, we won’t need a special training phase — the agent will just learn on-the-fly, like all humans, and indeed, like all animals. This new paradigm will render our current approach with LLMs obsolete. TLDR of my current thoughts My main difference with Rich is that I think the concepts he's using to distinguish LLMs from true intelligence are not actually mutually exclusive and dichotomous. Imitation learning is continuous with and complementary to RL. And relatedly, models of humans can give you a prior which facilitates learning "true" world models. I also wouldn’t be surprised if some future version of test-time fine-tuning could replicate continual learning. Imitation learning is continuous with and complementary to RL I tried to ask Richard a couple of times whether pretrained LLMs can serve as a good prior on which to accumulate the experiential learning (aka do the RL) which will lead to AGI. In a talk a few months ago, @ilyasut compared pretraining data to fossil fuels. This analogy has remarkable reach. Just because fossil fuels are not renewable does not mean that our civilization ended up on a dead-end track by using them. You simply couldn't have transitioned from the water wheels in 1800 to solar panels and fusion power plants. We had to use this cheap, convenient, plentiful intermediary. AlphaGo (which was conditioned on human games) and AlphaZero (which was bootstrapped from scratch) were both superhuman Go players. AlphaZero was better. Will we (or the first AGIs) eventually come up with a general learning technique that requires no initialization of knowledge - that just bootstraps itself from the very start? And will it outperform the very best AIs that have been trained to that date? 
Probably yes. But does this mean that imitation learning must not play any role whatsoever in developing the first AGI, or even the first ASI? No. AlphaGo was still superhuman, despite being initially shepherded by human player data. The human data isn’t necessarily actively detrimental - at enough scale it just isn’t significantly helpful. The accumulation of knowledge over tens of thousands of years has clearly been essential to humanity’s success. In any field of knowledge, thousands (and likely millions) of previous people were involved in building up our understanding and passing it on to the next generation. We didn't invent the language we speak, nor the legal system we use, nor even most of the knowledge relevant to the technologies in our phones. This process is more analogous to supervised learning than to RL from scratch. Are kids literally predicting the next token (like an LLM) in order to do cultural learning? No, of course not. But neither are they running around trying to collect some well defined reward. No ML learning regime perfectly describes human learning. We do things which are analogous to both RL and imitation learning. I also don't think these learning techniques are categorically different. Imitation learning is just short horizon RL. The episode is a token long. The LLM makes a conjecture about the next token based on its understanding of the world and the other information in the sequence. And it receives reward in proportion to how well it predicted the true token. Now, I already hear people saying: “No no, that’s not ground truth! It’s just learning what a human was likely to say.” Agreed. But there’s a different question which I think is more relevant to the scalability of these models: can we leverage imitation learning to help models learn better from ground truth? And I think the answer is, obviously yes? We have RLed models to win Gold in IMO and code up entire working applications from scratch. 
These are “ground truth” examinations. But you couldn’t RL a model to accomplish these feats from scratch. You needed a reasonable prior over human data in order to kick start the RL process. Whether you wanna call this prior a proper "world model", or just a model of humans, seems like a semantic debate to be honest. Because what you really care about is whether this model of humans helps you start learning from ground truth (aka become a “true” world model). It’s a bit like saying to someone pasteurizing milk, “Hey stop boiling that milk - we eventually want to serve it cold!” Yes, of course. But this is an intermediate step to facilitate the final output. By the way, LLMs have clearly developed a representation of the world (because their training process incentivizes them to). I use LLMs to teach me about everything from biology to AI to history, and they do so with remarkable flexibility and coherence. Are LLMs specifically trained to model how their actions will affect the world? No. But if we're not allowed to call their representations a “world model”, then we're defining the term by the process we think is necessary to build one, rather than by the obvious capabilities the concept implies. Continual learning Sorry to bring up my hobby horse again. I'm like a comedian who's only come up with one good bit. An LLM being RLed on outcome-based rewards learns O(1) bits per episode, and the episode may be tens of thousands of tokens long. We animals clearly extract far more information from interacting with our environment than just the reward signal at the end of each episode. Conceptually, how should we think about what is happening with animals? We’re learning to model the world through observations. The outer loop RL is incentivizing some other learning system to pick up maximum signal from the environment. In Richard’s OaK architecture, he calls this the transition model. 
If we were trying to pigeonhole this feature spec into LLMs, what you’d do is to fine tune on all your observed tokens. From what I hear from my researcher friends, in practice the most naive way of doing this doesn't work well. Being able to continuously learn from the environment in a high throughput way is obviously necessary for true AGI. And it clearly doesn’t exist with LLMs trained on RLVR. But there might be some relatively straightforward ways to shoehorn continual learning atop LLMs. For example, one could imagine making SFT a tool call for the model. So the outer loop RL is incentivizing the model to teach itself effectively using supervised learning, in order to solve problems that don't fit in the context window. I'm genuinely agnostic about how well techniques like this will work—I'm not an AI researcher. But I wouldn't be surprised if they basically replicate continual learning. Models already demonstrate something resembling human continual learning within their context windows. The fact that in-context learning emerged spontaneously from the training incentive to process long sequences suggests that if information could flow across windows longer than the current context limit, models would meta-learn the same flexibility they already show in-context. Concluding thoughts Evolution does meta-RL to make an RL agent. That agent can selectively do imitation learning. With LLMs, we’re going the opposite way. We first made a base model that does pure imitation learning. Then we do RL on it to make it a coherent agent with goals and self-awareness. Maybe this won't work! But I don't think these super first-principle arguments (for example, about how an LLM doesn’t have a true world model) prove much. I also don't think they’re strictly accurate for the models we have today, which undergo a lot of RL on “ground truth”. Even if Sutton's Platonic ideal doesn’t end up being the path to first AGI, his first-principles critique is extremely thought provoking. 
He’s identifying genuine basic gaps, which we don’t even notice because they are so pervasive in the current paradigm: lack of continual learning, abysmal sample efficiency, dependence on exhaustible human data. If the LLMs do get to AGI first, the successor systems they build will almost certainly be based on Richard's vision.
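One way to make the essay's claim that "imitation learning is just short horizon RL" concrete, as a sketch under assumptions of my own choosing: a policy $\pi_\theta$ over next tokens, a context $x$, the true next token $y^*$, and an indicator reward $r(a) = \mathbf{1}[a = y^*]$ for a one-token episode. The essay itself only says the reward is "in proportion to how well it predicted the true token", so the exact reward here is an assumption.

```latex
% Next-token (imitation) gradient on one example:
\nabla_\theta \log \pi_\theta(y^* \mid x)

% One-step policy gradient with reward r(a) = 1[a = y^*]:
\nabla_\theta \, \mathbb{E}_{a \sim \pi_\theta(\cdot \mid x)}\big[ r(a) \big]
  = \sum_a \pi_\theta(a \mid x)\, r(a)\, \nabla_\theta \log \pi_\theta(a \mid x)
  = \pi_\theta(y^* \mid x)\, \nabla_\theta \log \pi_\theta(y^* \mid x)
```

Under these assumptions the two updates point in the same direction and differ only by the scalar factor $\pi_\theta(y^* \mid x)$, which is one reading of the continuity between imitation learning and RL that the essay argues for.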
.@RichardSSutton, father of reinforcement learning, doesn’t think LLMs are bitter-lesson-pilled. My steel man of Richard’s position: we need some new architecture to enable continual (on-the-job) learning. And if we have continual learning, we don't need a special training phase - the agent just learns on-the-fly - like all humans, and indeed, like all animals. This new paradigm will render our current approach with LLMs obsolete. I did my best to represent the view that LLMs will function as the foundation on which this experiential learning can happen. Some sparks flew. 0:00:00 – Are LLMs a dead-end? 0:13:51 – Do humans do imitation learning? 0:23:57 – The Era of Experience 0:34:25 – Current architectures generalize poorly out of distribution 0:42:17 – Surprises in the AI field 0:47:28 – Will The Bitter Lesson still apply after AGI? 0:54:35 – Succession to AI
101
159
1,866
413,750
We’re way more patient in training human employees than AI employees. We will spend weeks onboarding a human employee and giving slow detailed feedback. But we won’t spend just a couple of hours playing around with the prompt that might enable the LLM to do the exact same job, but more reliably and quickly than any human.
114
91
1,785
167,890
Honestly the thing that motivated me to do this episode was learning that there's less than $200M/year of smart philanthropy on factory farming - GLOBALLY. Just to explain how fucking crazy that is: 1. It's insane how cheap the interventions that will spare BILLIONS of animals from gruesome, painful fates have been. Less than $200M has been spent getting corporate commitments that have already spared more than 400M hens from battery cages, and securing pledges that will spare billions more over the years to come. That’s < $1 per 10 years of animal well-being improved. Another example: In-ovo sexing (which determines the sex of eggs pre-birth) has already saved 200M male chicks from maceration at birth (with the potential to spare 7 billion every year). And it only cost ~$10 million to get off the ground. 2. 80 billion land animals are factory farmed every year. That means the ratio is $1 donated : 400 animals. 3. Compared to the amount of private philanthropy alone on global health ($50b+/year) or climate change ($15b+/year), the <$200M/year of smart money spent on factory farming is nothing. — The way we treat factory farmed animals is one of the worst atrocities in history. And unfortunately, the problem is on track to get worse every year. The case for optimism: Given how neglected this issue is, the scope of impact even one individual can have is absolutely massive. To be blunt, there are individual readers of this tweet who could DOUBLE the amount of smart money our entire civilization dedicates to this issue. Even with a few million dollars, you could single-handedly improve the lives of millions of factory-farmed animals. DM @Lewis_Bollard if you want to explore contributions over $50k.
Just $1 can help avert 10 years of farmed animal suffering. I decided to give $250,000 as a donation match to @farmkind_giving after learning about the outsized opportunities to help. FarmKind directs your contributions to the most effective charities in this area. Please consider contributing, even if it’s a small amount. Together, we can double each other's impact and give a total of $500,000. Use the link below to donate with my match. Bluntly, there are some listeners who are in a position to give much more. Given how neglected this topic is, one such person could singlehandedly change the game for 10s of billions of animals. If you’re considering donating $50k or more, please reach out directly to @Lewis_Bollard and his team by DMing him, or emailing andres@openphilanthropy.org
61
196
1,800
306,346
Sam Altman suggested that OpenAI could reach $100B in revenue by 2027. Anthropic reportedly forecasted $70 billion in revenue by 2028. Satya reacts to these projections.
76
146
2,413
504,082
the internet is an interesting place
if i were an indian serf ekeing out a modest existence on the deccan plains 200 years ago, the arrival of this man would compel me to hand over all my taxed grains. were i to find out that the raja's beautiful son had perished on the battlefield, my tears would salt the rivers.
24
26
1,737
139,884
Amazon's new 1000MW nuclear powered datacenter campus. Dario was right lol From our Aug 2023 interview: "Dario Amodei 01:14:36: There was a running joke that the way building AGI would look is, there would be a data center next to a nuclear power plant next to a bunker. We'd all live in the bunker and everything would be local so it wouldn't get on the Internet."
27
169
1,723
338,782
Reading while constantly asking Claude questions is 2x harder and 4x more valuable. Bloom 2 Sigma on demand.
48
75
1,687
176,914
Will Zuck open source the $10 billion model?
31
134
1,668
227,787
22
41
1,632
111,363
How exciting is my life? Identify the most interesting questions. Hunt down the smartest people in the world to answer them. Build a shadow university for myself, populated by some of the smartest mentors and friends one could be lucky enough to have. What would it look like to actually, *really*, try at this quest? What would someone more capable, ambitious, and driven than me do? And why am I not doing it already?
104
47
1,685
140,580
"I wonder for people in their 20s if they shouldn't go to San Francisco. The entrepreneurs are held in excessively high regard in my view. I think that San Francisco doesn't really encourage the pursuit of really deep technical depth." - @patrickc Full episode out tomorrow
43
105
1,620
526,185
An underrated benefit of democracy: Bad policies are often continued just because it’s too embarrassing for a regime to admit that it has been making a big mistake all along. This explains why Zero COVID or the One Child Policy were continued for far longer than they were even remotely reasonable. But in democracies, new leaders can credibly and without shame say, “The other party really fucked up. Let’s course correct.” As disastrous as the Iraq War was, within 6 years of its launch, the sitting President was openly calling it a huge boondoggle and trying to wind it down. Even when autocracies manage to get a lot of small things right, they stubbornly cling to their big mistakes, which they can never fix, and which overwhelm any of their positives.
74
96
1,586
134,910
Tony Blair on: - What he tells the dozens of world leaders who come seek advice from him - How much of a PM’s time is actually spent governing - What AI’s July 1914 moment will look like from inside the cabinet - What he learned from Lee Kuan Yew Links below. Enjoy!
33
251
1,534
307,603
Wanted to get better intuitions for how RL works on LLMs. So I wrote a simple script to teach Nanochat to add 5 digit numbers. I was surprised at how fast it learned. Until I looked at the model's generations and realized that it had just learned to always call the built-in Python interpreter 😂. The code I wrote is very remedial, minimal, and inefficient - I'm a professional podcaster, alright? But it might be helpful if you just want to see the basics of how REINFORCE or GRPO work. Link to gist below. Fundamentally, it's not that complicated: generate multiple trajectories per prompt. Update your model to make it more likely that it samples all the tokens in the successful trajectories.
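The recipe in the last two sentences above can be sketched on a toy problem. This is not the linked gist and does not use Nanochat: the "model" is an assumed table of per-position logits, the task is guessing a short target token sequence, and the update is REINFORCE with a group-mean baseline. Full GRPO also normalizes advantages by the group's standard deviation and adds clipping and a KL term, all omitted here.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def train(target, vocab=4, group_size=16, steps=500, lr=1.0, seed=0):
    """Toy policy gradient: one independent softmax per sequence position."""
    rng = np.random.default_rng(seed)
    target = np.asarray(target)
    length = len(target)
    logits = np.zeros((length, vocab))
    for _ in range(steps):
        probs = softmax(logits)
        # 1. Generate a group of trajectories for the same "prompt".
        samples = np.array([
            [rng.choice(vocab, p=probs[t]) for t in range(length)]
            for _ in range(group_size)
        ])
        # 2. Outcome reward: 1 only if the whole sequence is correct.
        rewards = (samples == target).all(axis=1).astype(float)
        # 3. Group-relative advantage: reward minus the group mean.
        advantages = rewards - rewards.mean()
        # 4. Raise the log-prob of every token in above-average trajectories.
        grad = np.zeros_like(logits)
        for traj, adv in zip(samples, advantages):
            one_hot = np.eye(vocab)[traj]      # shape (length, vocab)
            grad += adv * (one_hot - probs)    # d log pi / d logits, scaled
        logits += lr * grad / group_size
    return softmax(logits)
```

If every trajectory in a group fails, all advantages are zero and nothing is learned from that step, which is the sparse-reward problem in miniature.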
Excited to release new repo: nanochat! (it's among the most unhinged I've written). Unlike my earlier similar repo nanoGPT which only covered pretraining, nanochat is a minimal, from scratch, full-stack training/inference pipeline of a simple ChatGPT clone in a single, dependency-minimal codebase. You boot up a cloud GPU box, run a single script and in as little as 4 hours later you can talk to your own LLM in a ChatGPT-like web UI. It weighs ~8,000 lines of imo quite clean code to: - Train the tokenizer using a new Rust implementation - Pretrain a Transformer LLM on FineWeb, evaluate CORE score across a number of metrics - Midtrain on user-assistant conversations from SmolTalk, multiple choice questions, tool use. - SFT, evaluate the chat model on world knowledge multiple choice (ARC-E/C, MMLU), math (GSM8K), code (HumanEval) - RL the model optionally on GSM8K with "GRPO" - Efficient inference the model in an Engine with KV cache, simple prefill/decode, tool use (Python interpreter in a lightweight sandbox), talk to it over CLI or ChatGPT-like WebUI. - Write a single markdown report card, summarizing and gamifying the whole thing. Even for as low as ~$100 in cost (~4 hours on an 8XH100 node), you can train a little ChatGPT clone that you can kind of talk to, and which can write stories/poems, answer simple questions. About ~12 hours surpasses GPT-2 CORE metric. As you further scale up towards ~$1000 (~41.6 hours of training), it quickly becomes a lot more coherent and can solve simple math/code problems and take multiple choice tests. E.g. a depth 30 model trained for 24 hours (this is about equal to FLOPs of GPT-3 Small 125M and 1/1000th of GPT-3) gets into 40s on MMLU and 70s on ARC-Easy, 20s on GSM8K, etc. My goal is to get the full "strong baseline" stack into one cohesive, minimal, readable, hackable, maximally forkable repo. nanochat will be the capstone project of LLM101n (which is still being developed). 
I think it also has potential to grow into a research harness, or a benchmark, similar to nanoGPT before it. It is by no means finished, tuned or optimized (actually I think there's likely quite a bit of low-hanging fruit), but I think it's at a place where the overall skeleton is ok enough that it can go up on GitHub where all the parts of it can be improved. Link to repo and a detailed walkthrough of the nanochat speedrun is in the reply.
48
50
1,600
409,598
The @slatestarcodex & @DKokotajlo episode. Scott and Daniel break down every month from now until the 2027 intelligence explosion. Misaligned hive minds, Xi and Trump waking up, automated Ilyas accelerating AI progress. I went in quite skeptical. But I learned a tremendous amount by bouncing different objections against them. Links below. Enjoy! Timestamps: 00:00:00 - AI 2027 00:07:45 - Forecasting 2025 and 2026 00:15:30 - Why LLMs aren't making discoveries 00:25:22 - Debating intelligence explosion 00:50:34 - Can superintelligence actually transform science? 01:17:43 - Cultural evolution vs superintelligence 01:24:54 - Mid-2027 branch point 01:33:19 - Race with China 01:45:36 - Nationalization vs private anarchy 02:04:11 - Misalignment 02:15:41 - UBI, AI advisors, & human future 02:23:49 - Factory farming for digital minds 02:27:41 - Daniel leaving OpenAI 02:36:04 - Scott's blogging advice
55
196
1,588
362,170
.@slatestarcodex's answer. I actually agree with this. But if making new connections and discoveries is intrinsically an NP-hard problem, then it's also less likely that some super-intelligence will rapidly exhaust the tech tree and start making nanobots.
I still haven't heard a good answer to this question, on or off the podcast. AI researchers often tell me, "Don't worry bout it, scale solves this." But what is the rebuttal to someone who argues that this indicates a fundamental limitation?
60
64
1,547
225,761
God grant me the serenity to delegate the things I cannot automate, the courage to automate the things I can, and the wisdom to know the difference.
19
193
1,483
Really interesting new @gwern essay: LLM Daydreaming - Proposal of how default mode networks for LLMs are an example of missing capabilities for search and novelty Btw, I know it's a bit cringe to delight in, but if you had told 19 year old me that a Gwern essay would open like this, I would have found it hard to believe.
48
80
1,563
125,501
Google Books on steroids Search through the contents of 1000s of books! AI semantic search finds the most relevant passages & GPT writes a summary Textbooks, classics, histories, science, economics, tech, philosophy - you name it! Much much more to come 👀 Link below, enjoy!
55
199
1,501
327,978
Sometimes people say that even if all AI progress totally stopped, the systems of today would still be economically transformative. I disagree. The reason that the Fortune 500 aren’t using LLMs to transform their workflows isn’t because the management is too stodgy. Rather, it’s genuinely hard to get normal humanlike labor out of LLMs. And this has to do with some fundamental capabilities these models lack.
New blog post where I explain why I disagree with this, and why I have slightly longer timelines to AGI than many of my guests. I think continual learning is a huge bottleneck to the usefulness of these models, and extended computer use may take years to sort out. L-nk below.
148
67
1,513
254,836
I’m grateful to have what is probably the smartest audience in the world for a show as big as mine. If you would like to advertise on my show, please fill out the form in the link below!
52
44
1,491
322,369
It's funny how strong the causation is between time in SF and short timelines. I've been traveling for 4 weeks, and I'm now up to median year 2036 for AGI.
67
41
1,482
217,610
Microsoft makes software for humans doing work. Office and 365 generate ~$100B in revenue/year. I asked @satyanadella, what will happen to Microsoft as we move towards a world where AIs, not humans, are doing all the knowledge work? An interesting back and forth.
68
161
1,583
241,588
.@realGeorgeHotz and @ESYudkowsky will discuss AI safety and acceleration Live next Tuesday, Aug 15, at 2 PM Pacific nitter.app/i/broadcasts/1nAJErpDY…
82
146
1,381
796,697
.@satyanadella: The true AGI benchmark is whether we achieve 10% economic growth.
48
119
1,410
204,422
Whoah. Father of reinforcement learning and this year's Turing Award winner @RichardSSutton. I'm shy.
Dwarkesh Patel is 100% right on this: AI's utility is very strongly dependent on continual learning. piped.video/nyvmYnz6EAg?si=D2v2…
29
47
1,462
96,546
Who is the best blogger in China? Who is China's Scott Alexander or Gwern or Tyler?
95
49
1,400
217,578
Before I interviewed @Lewis_Bollard, I had assumed that factory farming was on its way out (especially given new tech like cultivated meat around the corner). Unfortunately this is far from inevitable: factory farms are already incredibly efficient machines for making meat (the most efficient broiler chickens convert 1.38 kg of grain into an astonishing 1 kg of flesh). I'd previously assumed that cultivated meat will soon trounce factory farming just on raw economics. Growing meat around a whole creature and mind cannot be the most efficient way to produce tasty flesh, right? Lewis thinks we may be many decades away (at least) from this outcome. Evolution has spent on the order of tens of millions of years optimizing human intelligence. And in order to try replicating this feat, AGI labs have to spend hundreds of billions of dollars a year. Evolution has spent far longer than that (basically the entire time) figuring out how to convert food into meat efficiently. Of course, tech on farms historically has favored more suffering (think gestation crates, battery cages, and overgrown broiler chickens). Improving the conditions on factory farms also requires corporate commitments and regulations against the most cruel practices. Every year we're factory farming about 2% more land animals globally. On the default trajectory, the amount of raw suffering in the world is likely to keep increasing. But there are reasons to think this can change. New technologies like in-ovo sexing have already saved hundreds of millions of male chicks from gruesome fates, with the potential to save billions more. And corporate commitments to go cage-free have already spared well north of 500 million hens from torturous battery cages (again, with the potential to help tens of billions more). Full episode with @Lewis_Bollard out tomorrow.
57
88
1,410
527,611
In the hypothetical world where I interview President @EmmanuelMacron next week, what should I ask him? PS he’s currently deciding whether to do this interview - if you have a channel to him, do let him know that this interview would be a good idea :)
152
25
1,385
69,770
500k! So much of the credit for the growth over the last year belongs to my wonderful editors, Conor, Aaron, and Ishan. Conor was a farmer in Argentina, Ishan a Freshman maths student in Sri Lanka, and Aaron an editor for Mr Beast. I'm proud of them!
58
15
1,359
66,906
How @_sholtodouglas got scouted by Google DeepMind: “Every night from 10 PM till 2 AM, I would do my own research. @jekbradbury saw some of my questions online and was like, ‘I thought I knew all the people in the world who were asking these questions. Who on Earth are you?’”
A good example is @_sholtodouglas at @GoogleDeepMind. He's quiet on Twitter, doesn't have any flashy first-author publications, and has only been in the field for ~1.5 years, but people in AI know he was one of the most important people behind Gemini's success
21
113
1,475
570,615
Looking for an energy expert to interview on my podcast. I want to get in the weeds on what will happen in the wild AI worlds. As AI actually becomes capable of substituting for human labor, your country's GDP will be denominated by your AI population size, which is downstream of energy. What does this mean for different countries? Given how fast the US is falling behind China in electricity generation, what would it take for us to make up the ground? What are the most plausible sources (natural gas, nuclear, solar), what are their supply curves, what are the main physical or regulatory bottlenecks that would slow down a ramp up, etc.? Who's the right guest to chat this through?
253
79
1,372
149,188
New blog post: Questions about the future of AI A 6,000-word clusterfuck of considerations about economics, history, training, investment, and more. Thread of select questions below:
27
108
1,351
222,922
you either die a hero, or live long enough to get cooked in a @slatestarcodex parody blog post
22
34
1,351
61,661
Every time I fail a spaced repetition review for a card which I remember thinking was almost too trivial to write down, I become more convinced that everything I read without making cards for is a waste of time.
68
52
1,335
228,890
Tony Blair's single biggest piece of advice to Labour once they win the election:
105
222
1,250
356,691
New blog post where I explain why I disagree with this, and why I have slightly longer timelines to AGI than many of my guests. I think continual learning is a huge bottleneck to the usefulness of these models, and extended computer use may take years to sort out. L-nk below.
"Even if AI progress totally stalls, it's sufficiently easy to collect data on all these different white collar job tasks that we should expect to see them automated within the next 5 years."
88
90
1,335
912,254
People underrate how big a bottleneck inference compute will be. Especially if you have short timelines. There are currently about 10 million H100 equivalents in the world. By some estimates, the human brain has the same FLOPS as an H100. So even if we could train an AGI that is as inference efficient as humans, we couldn't sustain a very large population of AIs. Not to mention that a large fraction of AI compute will continue to be used for training, not inference. And while AI compute has been growing 2.25x per year so far, by 2028, you'd be pushing against TSMC's overall wafer production limits, which grow 1.25x per year according to the AI 2027 Compute Forecast. ht @EgeErdil2, @EpochAIResearch's "Can AI Scaling Continue Through 2030?", AI-2027 compute forecast
cue the @ohlennart laser eyes meme
81
154
1,333
304,715
Just $1 can help avert 10 years of farmed animal suffering. I decided to give $250,000 as a donation match to @farmkind_giving after learning about the outsized opportunities to help. FarmKind directs your contributions to the most effective charities in this area. Please consider contributing, even if it’s a small amount. Together, we can double each other's impact and give a total of $500,000. Use the link below to donate with my match. Bluntly, there are some listeners who are in a position to give much more. Given how neglected this topic is, one such person could singlehandedly change the game for 10s of billions of animals. If you’re considering donating $50k or more, please reach out directly to @Lewis_Bollard and his team by DMing him, or emailing andres@openphilanthropy.org
New episode w @Lewis_Bollard - a deep dive on the surprising economics of the meat industry. 0:00:00 – The astonishing efficiency of factory farming 0:07:18 – It was a mistake making this about diet 0:09:54 – Tech that’s sparing 100s of millions of animals/year 0:16:16 – Brainless chickens and higher welfare breeds 0:28:21 – $1 can prevent 10 years of animal suffering 0:37:26 – The situation in China and the developing world 0:41:41 – How the meat lobby got a lock on Congress 0:53:23 – Business structure of the meat industry 0:57:42 – Corporate campaigns are underrated Available on YouTube, Apple Podcasts, Spotify, etc (look up Dwarkesh Podcast).
79
161
1,286
589,116
I haven’t had you on my podcast because your takes don’t make much sense to me and don’t seem well thought out. But I’m happy to debate you. @ManifoldMarkets has been DMing you trying to set up a debate between us. I’ve agreed. Given your statement here, I assume you’re in?
Latent agenda reveal. Dwarkesh has repeatedly turned down my offers to appear on his pod over the past year. I thought it could bring balance to his guest lineup. He's clearly positioning himself as a mouthpiece for the AI Safetyist agenda/spreading EA propaganda.
86
20
1,226
1,046,989