Co-founder of @MechanizeWork. Married to @natalia__coelho. Email: matthew at mechanize dot work

San Francisco, CA
Twin and adoption studies consistently show that parenting choices have minimal effects on a kid's eventual intelligence, personality, or happiness (except in cases of extreme neglect or abuse). This should revolutionize how we raise children, yet almost nobody knows or cares.
136
126
1,415
232,248
A new LLM truthfulness benchmark just dropped. (Context: Alabama in fact has a higher per capita GDP than Japan.)
38
66
1,163
249,431
There's been a lot of low quality GPT-4 speculation recently. So, here's a relatively informed GPT-4 speculation thread from an outsider who still doesn't know that much. 🧵
29
181
1,181
614,647
If you give people cash and they choose to spend it on a bunch of junk rather than on education or healthcare, one way to interpret that result is that marginal spending on education and healthcare is worth less than a bunch of junk.
The "cash can replace a strong social safety net" people taking a real L this morning.
24
58
888
56,530
I have no dislike for philosophers, but the profession did not prepare us well for AI. The field is full of muddled thinking, abysmal takes like the Chinese Room Argument, a focus on pointless vague inquiries over big picture questions, and is often detached from actual AI.
53
22
536
112,615
I wish people made their predictions falsifiable. Robin Hanson has been saying that the current AI boom will bust since at least 2016, but AI has rapidly gotten better over that entire time frame, with correspondingly more investment and attention. When can we say he was "wrong"?
Replying to @robinhanson
The current burst of AI activity will likely fade, as have many bursts before, before the future burst when AI actually takes over the world. Something else will be the "big thing" between future AI bursts.
15
13
453
28,837
I personally think $20 a month is cheap when the benefit is knowing whether a fundamental claim in your field is valid (and in this case, the claim is approximately not valid).
18
8
428
63,894
Why do so many people think that humans don't trade with animals because we're way more powerful than them, rather than because they can't talk or keep agreements? Do they think cats, mice, and ants really couldn't do anything useful for us if we could coordinate with them?
81
10
438
39,138
I currently think this open letter is quite bad, and possibly net harmful. The proposed policy appears vague and misguided. I want to explain some of my thoughts. 🧵 futureoflife.org/open-letter…
21
51
392
181,534
Why are some people still treating AGI as a thing that will at some point "be invented"? At this point, doesn't it seem pretty clear that AIs will just get continuously more general and capable with no clear finish line?
60
23
372
41,047
Most rankings of the top causes of death can be misleading, since they count someone dying at 20 the same as someone dying at 90. When you weight by life-years lost, you get this ranking (as of 2015). From ncbi.nlm.nih.gov/pmc/article…
16
53
350
How it feels to read stories about how an AGI can take over the world.
22
19
353
36,862
It's frustrating when people say "AI progress is too fast" while over 100,000 people still die from aging per day, with no sign of abating. It's like we're in a huge, deadly war and people say our leaders are rushing to agree to a peace settlement. No, they should go even faster.
51
19
330
79,450
Here's a line of reasoning for AI doom I've seen before that seems bad: 1. The first AGI will be able to end the world via nanotech 2. I can't explain exactly how it could do that 3. But (2) doesn't matter, because an AGI will be much smarter than me, and will figure it out
53
17
302
62,784
I graded GPT-4's responses on @bryan_caplan's economics midterm—the same one that ChatGPT got a D on—and it got an A. I don't think GPT-4 is at human-level yet across a wide range of tasks, but I'm feeling good about my bet right now. matthewbarnett.substack.com/…
10
35
300
97,440
I admit I'm confused why some people think there's a fundamental barrier deep learning still needs to break through before obtaining "real intelligence". I understand thinking that in 2021, but how could you say that after talking to GPT-4 for an hour?
76
10
309
144,170
A popular idea in AI risk literature until recently was that AIs would very quickly go from below human-level to above human-level intelligence. As Nick Bostrom put it, "The train doesn't stop at Humanville Station. It's likely, rather, to swoosh right by."
28
19
295
98,211
I also used to think this, and it was one of the reasons why I had long AI timelines. But I changed my mind. Existing evidence suggests that technologies are getting adopted much faster now. And we know that ChatGPT was adopted very fast compared to e.g. electricity.
Every economist I know says that it takes 15 to 20 years before a new general purpose technology has a measurable effect on productivity. The delay is determined by how fast people learn to use it. So no, AI is not going to cause instant mass unemployment. It's going to displace jobs over time and make people more productive, just like every other technological revolution before that.
13
38
288
68,567
This is a good time to reflect on the "AI effect". Before a benchmark is solved, people often think we'll need "real AGI" to solve it. Then, afterwards, we realize the benchmark can be solved using mere tricks. Will this benchmark fall in the same way? Honestly, I'm not sure.🧵
1/10 Today we're launching FrontierMath, a benchmark for evaluating advanced mathematical reasoning in AI. We collaborated with 60+ leading mathematicians to create hundreds of original, exceptionally challenging math problems, of which current AI systems solve less than 2%.
19
33
330
127,849
I genuinely think "consciousness" is simply the modern, secular term for "soul". Both refer to unfalsifiable concepts used to determine who is in or out of our moral ingroup. Neither are empirical designations discovered through experiment, but socially constructed categories.
49
12
303
39,059
If or when effective anti-aging therapies are developed, I predict most people will sign up, including the guy I'm quote-tweeting. The rationalizations people come up with for aging and death are flimsier than a house of cards in a gusty wind.
I genuinely don't understand the desire to live to be 120 years old. We unlocked the secret to eternal life a few millennia ago: just get married and have babies. Then you live on through your kids.
23
19
290
25,701
Replying to @AlexGodofsky
Being an activist for a cause doesn't mean you need to support everything that helps the cause, including things that have large costs in other ways. I don't think going to this protest reveals a large inconsistency in Greta's behavior.
5
224
9,956
From my POV, I am trying to *save everyone's lives* by accelerating AI. My view is that AI will accelerate medical cures that could save the lives of billions. Delaying AI therefore risks killing billions of people. I am on the side of life, not death. I want us all to live.
55
15
252
22,614
At some point in the next 5 years, I expect people will create a giant AI-generated encyclopedia that has lower peak quality, but higher average quality than Wikipedia. This will potentially do to Wikipedia what Wikipedia did to Encyclopedia Britannica.
25
13
228
55,711
I want to highlight that @DKokotajlo has been polite and focused on object-level points in just about every discussion that I can remember with him, including when we vehemently disagreed. I appreciate this, as it seems like a surprisingly rare and undervalued personality trait.
8
5
253
6,729
I know "the Nazis were really bad" is not an interesting or original take. But I'm continuously shocked at how terrible they were. It's like each time I learn more about them, my opinion of them drops even further.
21
5
214
Something that surprised me last year regarding LLMs was their ability to do mathematics well. I now suspect that mathematics is not much harder for computers to understand than ordinary natural language documents. This has pretty interesting implications. 🧵
12
20
238
111,624
The foom debate in three parts. 1/3
6
47
231
I want to know how seriously to take this study. It suggests that dictators routinely lie about GDP data by large amounts. If true, it would indicate that the world is a lot poorer than the statistics show. (The paper has yet to be published.) archive.ph/GVCgW
19
30
228
It's interesting for me to see many replies to this tweet arguing that he's wrong. I personally think this is a banal philosophical thesis. What makes so many people think that silicon cannot host conscious minds in the same way that biology can?
David Chalmers says it is possible for an AI system to be conscious because the brain itself is a machine that produces consciousness, so we know this is possible in principle
69
3
224
20,652
In my opinion, current progress in language models makes this picture look false. Right now, LLMs seem to be incrementally moving through the human range of abilities for various general intellectual tasks without any sudden cross-domain jumps in power.
27
8
229
24,915
Combining these assumptions, I estimate that the total training compute for GPT-4 will be between 2.54 billion and 130 billion petaFLOP, with a central estimate of 18 billion petaFLOP. For comparison, that's roughly 1-50 times more compute than PaLM.
4
43
228
53,435
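The comparison with PaLM in the estimate above can be checked with a few lines of arithmetic. This is only a sketch: the three GPT-4 figures come from the post, while `PALM_PETAFLOP` is an assumed round value for PaLM's training compute (about 2.5 billion petaFLOP) that the post itself does not state.

```python
# Figures quoted in the post, in petaFLOP (1 petaFLOP = 1e15 FLOP).
ESTIMATES_PETAFLOP = {
    "low": 2.54e9,
    "central": 18e9,
    "high": 130e9,
}

# ASSUMPTION (not from the post): PaLM's training compute, taken here
# to be roughly 2.5 billion petaFLOP for illustration.
PALM_PETAFLOP = 2.5e9


def ratio_to_palm(estimate_petaflop: float) -> float:
    """Return how many times PaLM's assumed compute an estimate represents."""
    return estimate_petaflop / PALM_PETAFLOP


def to_flop(petaflop: float) -> float:
    """Convert petaFLOP to plain FLOP."""
    return petaflop * 1e15


ratios = {name: ratio_to_palm(value) for name, value in ESTIMATES_PETAFLOP.items()}

for name, ratio in ratios.items():
    print(f"{name}: {to_flop(ESTIMATES_PETAFLOP[name]):.2e} FLOP, {ratio:.1f}x PaLM")
```

Under that assumed PaLM figure, the low and high ends land near 1x and 50x, which matches the "roughly 1-50 times" range in the post.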
OpenAI just updated their audio transcriber, Whisper. I just tried it out. It's very close to human-level in my test. You should consider using it as an alternative to human transcription. github.com/openai/whisper
3
26
230
Putting aside how interesting it would be, the dark forest hypothesis seems very weak. Why wouldn't hostile aliens just send space probes to every star system and monitor planets closely? Somehow they only care enough to eliminate "loud" competition?
This has supplanted New Chronology as the worldview I admire the inventiveness of the most without believing any of.
69
3
216
49,265
Question for AI pessimists: suppose an AI is released that is clearly better than the top human mathematicians at math. It can also write long, coherent books comparable to the best human authors. Three months pass and the world does not end. How do you update on p(doom)?
82
13
217
79,850
"Superintelligence" seems overrated. o3 is already quite intelligent: it can do math, write code, and understand research. Yet, most people would probably find greater value in a robot that cleans their room. What matters is super-useful AI, not necessarily superintelligent AI.
22
10
222
15,856
I'm frustrated by the negativity towards Anthropic on my feed today. Personally, I think they're doing great work. They're showing how to be responsible while swiftly advancing AI capabilities. Ironically, they're criticized for both of these things, but I appreciate both.
13
4
221
10,154
I think there is something true about @robinhanson's thesis that fear of AI is often just fear of the future. For example, here's @KatjaGrace sharing her worries about what may happen even if AI doesn't kill everyone.
32
20
204
27,866
"Alexey Guzey’s Theses on Sleep gained a lot of popularity and acclaim on LessWrong and among people I follow on social media, despite largely consisting of what I think were weak arguments and misleading claims." lesswrong.com/posts/sbcmACvB…
7
23
214
"I think my students can stop worrying that their hard-won skills and knowledge will be outstripped by an AI program anytime soon." Will Steve Landsburg put his money where his mouth is? I'm happy to bet him that an AI will score As on his exams before 2028 >75% of the time.
GPT4 gets a 0 on Steven Landsburg's undergrad econ exam. But damn, Steven's questions are tricky. Not hard computationally but you really have to think like an economist! thebigquestions.com/2023/04/…
11
13
214
50,850
I wrote a LessWrong post about why I think some MIRI people (@ESYudkowsky, @So8res, and @robbensinger) should probably update on alignment being easier than they expected in light of the fact that LLMs seem to follow directions well and act morally. lesswrong.com/posts/i5kijcjF…
18
26
204
44,558
A reminder that Baumol's cost disease is poorly named. It's not a disease. It's a side effect of unequal productivity growth across sectors. The increasing cost of services means that we're richer on average, not poorer, than before.
2
21
208
We are offering a $500k base salary for this role. That's not total compensation: we're paying equity on top of the $500k. If you know any highly experienced software engineers who might be a good fit, please reach out. It's totally fine if they don't have any experience in ML.
We're hiring software engineers. $500k base.
18
6
207
55,964
Here's a prediction from February 2023 that we can now evaluate. Has the pace of AI progress over the past few years felt "intuitively nuts" to you? Personally, I don't think so. AI had a big moment with ChatGPT and GPT-4, but the 2 years since then felt mostly incremental to me.
A mental model I have of AI is it was roughly ~linear progress from 1960s-2010, then exponential 2010-2020s, then has started to display 'compounding exponential' properties in 2021/22 onwards. In other words, next few years will yield progress that intuitively feels nuts.
35
6
192
36,308
Replying to @eshear
Your rebuttal to decades of consistently replicated research about the role of genes in human behavior is to cite a single example where genes presumably played a very substantial role?
8
178
6,912
In my opinion, we appear close to achieving this milestone that @GaryMarcus proposed in 2014, which he described as a "Turing Test for the twenty-first century".
Gemini 1.5 Pro can perform highly-sophisticated understanding & reasoning tasks for different modalities, including video. 📹 When given a 44-minute silent Buster Keaton film, it can analyze various plot points, and even reason about small details that could easily be missed. ↓
15
15
189
37,858
The argument that the internet didn't impact GDP much because we've kept to a ~1.5% yearly per capita growth trend since 1990 seems weak to me. Perhaps without the internet, we would have had 0.5% yearly growth instead. How can you infer the counterfactual from the trend?
20
8
169
28,138
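The counterfactual point about internet-era growth can be made concrete by compounding the two rates. This is a minimal sketch: the 1.5% trend and the 0.5% counterfactual are the post's own numbers, and the 30-year horizon (1990 to 2020) is an assumption chosen for illustration.

```python
# Rates from the post: observed trend vs. the post's hypothetical
# no-internet counterfactual.
OBSERVED_GROWTH = 0.015
COUNTERFACTUAL_GROWTH = 0.005

# ASSUMPTION: a 30-year horizon (1990 to 2020), chosen for illustration.
YEARS = 30


def compound(rate: float, years: int) -> float:
    """Return the cumulative growth factor after compounding `rate` yearly."""
    return (1.0 + rate) ** years


observed_factor = compound(OBSERVED_GROWTH, YEARS)
counterfactual_factor = compound(COUNTERFACTUAL_GROWTH, YEARS)

# How much richer per capita the observed world is than the counterfactual one.
relative_gain = observed_factor / counterfactual_factor

print(f"observed: {observed_factor:.2f}x, counterfactual: {counterfactual_factor:.2f}x")
print(f"observed world is {relative_gain:.2f}x the counterfactual")
```

If the counterfactual rate really were 0.5%, an unchanged 1.5% trend would still hide a per capita income gap of roughly a third after three decades, which is why the trend alone cannot settle the question.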
I outline what I currently consider to be the most plausible AI doom story here: lesswrong.com/posts/MnrQMLuE…
46
16
182
107,812
Many AI risk arguments focus on showing that AIs could take control in a sudden, violent takeover. But I think we're already going to be giving AIs control of our civilization by default. We're going to give up the keys voluntarily. A dramatic takeover event isn't necessary.
22
9
181
19,672
I think there’s sometimes a motte and bailey in these discussions. I think smarter-than-human AIs are clearly possible. But I’m skeptical of the premise that they will quickly turn into near-omnipotent gods with seemingly unlimited powers of persuasion and deception.
Saying "I don't believe in ASI" is just the most insane cope. Let's say Einstein-level intelligence truly is some sort of universal intelligence speed limit. What do you think 1000s of Einsteins thinking together thousands of times faster than humanly possible looks like?
19
6
175
14,999
I think some people underrate the possibility that we don't need to understand how neural networks work in order to align them. We manage to align humans and domesticated animals reasonably well, even though we don't fully understand how their brains work.
23
14
174
26,268
I think I've uncovered an error in @ESYudkowsky's book Inadequate Equilibria that undermines a key point in the book. See the full thread for details.🧵
8
7
175
53,610
.@bryan_caplan says he is not impressed by ChatGPT, as it scored a D on his labor economics exam. But does he expect the technology to improve? I'd be happy to bet him that language models will consistently earn A's on his exams before 2028. betonit.substack.com/p/chatg…
12
9
165
55,970
The first image is from @ESYudkowsky in 2016. I think this prediction is clearly becoming increasingly untenable. GPT-4 seems to have a fair degree of situational awareness, can pursue goals to help us, and yet doesn't resist shutdown by default.
37
5
173
69,977
I want to pre-register that I mostly agree with the scenario depicted in AI 2027 up until about 2029. I expect the essay's predictions to be falsified around 2029–2030 when our economy has not yet been ~fully automated. However, until that point, the essay appears reasonable.
9
2
171
16,817
The first thing to understand about FrontierMath is that it's genuinely extremely hard. Almost everyone on Earth would score approximately 0%, even if they're given a full day to solve *each* problem. For fun, here's what a few people on Reddit said after looking at the problems.
4
10
178
36,500
I sincerely wish for people to more frequently update their understanding of things like AI risk and AI takeoff as we get more info about the technology. I still see a lot of people stuck in frameworks that made sense in 2013 but not 2023. Please try harder.
9
19
163
36,815
On a basic level, I find it pretty suspicious that a large fraction of EA (definitely not all of it) has converged onto the position that the best way of ensuring we get to the vast, vibrant post-human future is to shut down the ~only technology capable of taking us there.
32
9
156
44,265
To those who think AGI will merely continue 2-6% GDP growth, would you say the same about a technology that allowed for extremely fast ordinary population growth, e.g. a machine that near-instantly duplicated humans for $30,000 each?
22
5
157
13,572
AGI would be a far more important innovation than room temperature superconductors. With AGI, people could have armies of servants that obey their every whim, and who are much smarter than them. R&D could be automated. It's not a close competition.
I can't help but notice how the excitement over the mere *possibility* of an actual no-shit physical tech breakthrough makes all the years of hype around chatbots and game-playing robots look kinda shallow and forced by comparison. benlandautaylor.com/2023/05/…
17
1
154
22,401
While I appreciate this study, I'm also a bit worried its headline result is misleading—it only measures performance on a narrow set of software tasks. As of March 2025, AIs still can't handle 15-minute robotics or computer-use tasks, despite what the headline plot might suggest.
When will AI systems be able to carry out long projects independently? In new research, we find a kind of “Moore’s Law for AI agents”: the length of tasks that AIs can do is doubling about every 7 months.
8
13
153
19,780
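The "doubling about every 7 months" claim in the quoted post implies simple exponential growth in task length. Here is a minimal sketch of that extrapolation: the 7-month doubling time is from the quoted post, while the starting and target task lengths used below are hypothetical values picked only to show the calculation.

```python
import math

# Doubling time from the quoted post.
DOUBLING_MONTHS = 7.0


def task_length_after(start_minutes: float, months: float) -> float:
    """Extrapolate task length, assuming the doubling trend simply continues."""
    return start_minutes * 2 ** (months / DOUBLING_MONTHS)


def months_until(start_minutes: float, target_minutes: float) -> float:
    """Months needed for task length to grow from start to target."""
    return DOUBLING_MONTHS * math.log2(target_minutes / start_minutes)


# HYPOTHETICAL inputs: a 60-minute starting point and an 8-hour target.
print(task_length_after(60, 7))   # one doubling period later
print(months_until(60, 8 * 60))   # three doublings needed
```

As the reply above points out, a trend like this only describes the task domain it was measured on, so the extrapolation says nothing by itself about robotics or computer-use tasks.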
The open letter proposes that we prohibit giant training runs, possibly by law, but explicitly allow algorithmic progress. This would create a "hardware overhang" in which a discontinuous capability increase becomes more likely if these constraints are ever lifted.
9
6
147
17,646
Replying to @rgblong
GPT-3.5 was reportedly finished training in early 2022. Since I estimated that GPT-4 could be trained for up to 12 months, and they likely need to fine-tune and test it, my guess is that we're looking at a release in the early months of 2023, with maybe a median of March.
6
145
20,745
It's interesting that we actually got something like the 6-month pause that FLI was asking for in their open letter. Nothing in the last 12 months has meaningfully surpassed GPT-4. How much safety benefit did we get from this unforced pause?
27
13
142
41,368
Every consumer good has consumer surplus, so this explanation is too general to explain much about AI in particular. A better explanation for why AI isn't meaningfully showing up in GDP is that AI has simply had a relatively small impact on economic production so far.
Really interesting article. Why isn't the impact of AI showing up in GDP? Because most of the benefit accrues to consumers. To measure impact, they investigate how much people would *need to be paid to give up a good*, rather than what they pay for it.
7
5
145
14,891
Replying to @eshear
I'm not claiming that because (1) I'm talking about intelligence, personality, and happiness, not trained skills like tennis, (2) this is about statistical regularities, not sweeping claims about every case, and (3) the unique environment makes counterfactuals difficult to infer.
5
136
3,752
If it were up to a vote, the public might also ban genetically modified foods. Thankfully, our institutions rely more on expert assessments of risk than general opinion. We should probably do the same for AGI.
The public continues to be very clear about what it thinks about AGI.
18
13
139
54,969
Some improvements we might start to see more in large language models within 2 years: - Explicit memory that will allow it to retrieve documents and read them before answering questions arxiv.org/abs/2112.04426
5
19
140
22,988
Replying to @LaraThurnherr
I feel like anyone who said "writing interesting stories" was nowhere near solved in January 2021 simply wasn't paying attention to recent progress.
8
141
6,557
I mostly don't think Gemini is impressive for Google. The very modest improvements over GPT-4, which finished pre-training in August 2022, suggest that Google is still underrating the importance of hardware, lacks crucial engineering talent, or both.
10
2
129
17,260
Sources: You can get the GDP of Alabama from the BEA, and divide it by the population to get $55,124. The World Bank shows the per capita GDP (PPP) of Japan at $45,572. Both of these figures are adjusted for cost of living and inflation (they're in 2022 international dollars).
5
129
10,617
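The comparison in the sources post comes down to one division and one ratio. A small sketch, using only the two per capita figures quoted there; the `per_capita` helper is included to show the "divide GDP by population" step, and no state-level totals are filled in because the post does not give them.

```python
# Per capita figures quoted in the post (2022 international dollars).
ALABAMA_PER_CAPITA = 55_124
JAPAN_PER_CAPITA = 45_572


def per_capita(total_gdp: float, population: float) -> float:
    """Divide total GDP by population, as described in the post."""
    return total_gdp / population


def relative_gap(a: float, b: float) -> float:
    """Return how much larger `a` is than `b`, as a fraction of `b`."""
    return (a - b) / b


gap = relative_gap(ALABAMA_PER_CAPITA, JAPAN_PER_CAPITA)
print(f"Alabama is {gap:.1%} above Japan on these figures")
```

On the quoted numbers the gap is about a fifth, so the ranking would not flip from small revisions to either figure.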
Probably the most dubious assumption I made is that OpenAI will have enough high quality data to train their model compute-optimally. They've likely been working to scrape as much data from the internet as possible. But they may have hit a limit, e.g. see lesswrong.com/posts/6Fpvch8R…
2
9
137
24,838
One thing earlier futurists missed was that behavioral cloning is a lot easier than brain scanning and detailed simulation. I expect the first human mind uploads will be deep learning models fine-tuned on a person's behavioral data, without needing full neuron-level duplication.
24
7
139
13,247
With the algorithmic adjustment, the qualitative improvement from GPT-3 (vanilla) to GPT-4 is comparable to the improvement from GPT-2 to GPT-3. Since that was a rather big jump, I expect many will be stunned by GPT-4, especially those who expected strong diminishing returns.
2
10
139
28,854
I am quite skeptical of the concept of "human values" as it is typically used in many AI risk arguments. The concept seems to imply that humans basically all have the same values by virtue of their species membership, but this seems like an empirically unfounded theory.
16
12
133
10,211
Joseph Carlsmith estimated that the human brain uses approximately 10^15 FLOP/s. Over 30 years, that's about 10^24 FLOP. Language models exploded in popularity in the last year, timed almost exactly with the release of ML models trained using over 10^24 FLOP.
9
14
138
19,925
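The brain-compute comparison above is a one-line multiplication, sketched here as a check. The 10^15 FLOP/s rate and the 30-year span are the post's figures; the year length of 365.25 days is an assumption made for the conversion to seconds.

```python
import math

# Figures from the post.
BRAIN_FLOP_PER_SECOND = 1e15
YEARS = 30

# ASSUMPTION: an average year of 365.25 days.
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def lifetime_flop(flop_per_second: float, years: float) -> float:
    """Total FLOP performed at a constant rate over the given number of years."""
    return flop_per_second * years * SECONDS_PER_YEAR


total = lifetime_flop(BRAIN_FLOP_PER_SECOND, YEARS)
order_of_magnitude = round(math.log10(total))

print(f"{total:.2e} FLOP, about 10^{order_of_magnitude}")
```

The result comes out a little under 10^24 FLOP, consistent with the "about 10^24" figure in the post.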
I recently criticized the calls to pause model scaling. However, my arguments were brief. Therefore, I thought it might be valuable to elaborate on my view that we should be cautious about slowing down AI progress. 🧵
5
29
134
45,925
In a blog post from 2020, Microsoft announced a new supercomputer for the exclusive purpose of training large ML models for OpenAI. They stated that "Compared with other machines listed on the TOP500 supercomputers in the world, it ranks in the top five". blogs.microsoft.com/ai/opena…
1
5
137
36,255
A lot of people still seem to have the impression that AGI will be useful by being a smart thing we keep inside a lab doing science, like a lone genius. I disagree. The main reason AGI is useful is because we can deploy billions of them to automate labor everywhere.
16
12
129
47,873
I opened a Manifold Market about whether GPT-4 will get the Monty *Fall* problem correct. manifold.markets/MatthewBarn…
22
6
133
34,516
In a 2021 discussion, both Paul Christiano and Eliezer Yudkowsky agreed that no AI would pass a hard 1-hour Turing Test until the "End Times", i.e. until after the world ended or after a huge economic acceleration. I suspect these predictions will look quite bad within 6 years.
17
7
131
13,295
Replying to @EgeErdil2
Potentially the result of framing. "Is it true that..." vs. "which is higher?"
3
125
7,950
It's crazy to me how some people still seem to think whole brain emulation has a >25% chance of coming before de novo AGI, even after GPT-3 and a decade of very slow progress on brain scanning/emulation. @robinhanson why? lesswrong.com/posts/mHqQxwKu…
8
15
126
One of my least charitable philosophical takes: I don’t see how non-consequentialist moral theories are anything more than the result of people not thinking very hard about how to actually take actions in the real world.
10
7
122
One of the most common arguments against AGI being near is the following take: AI has gone through many boom and bust cycles before in which people thought we were close, but we ended up being far. This boom will also bust. Ultimately, I find this argument quite weak. 🧵
7
10
125
46,502
I wonder how much higher birth rates would be if everyone were familiar with these research results.
11
120
9,469
Disagreement about AI timelines is often framed as a disagreement about the anticipated rate of future AI progress. However, I believe the real disagreement is often not about the rate of progress, but about the threshold required for AI to be transformative.
4
8
117
One reason why I'm skeptical of theoretical AI alignment research comes from looking at its empirical track record. For example, these screenshots are from a well-received post from 2019 by a smart alignment researcher. Did this agenda end up being useful at all for aligning AI?
15
6
116
27,411
The following are my thoughts on recent events involving FTX.🧵 Although I was completely unaware of any fraud until this week, I intend to fully return any money that I received, even indirectly, from victims of fraud or other financial crimes, including money I already spent.
8
6
114
My experience of watching other people vaguely informs me that this shift will become more common in the future as AGI draws nearer. Many people who are currently excited about AI seem like they'd become way more fearful if they thought actual superintelligence is arriving soon.
when my AGI timeline was 30-50 years vs when it became like 5 years
6
5
115
8,807
This new executive order doesn't appear to focus much at all on things EAs care about. I see nothing about responsible scaling, misalignment, or pausing AI. This is worth noting, since I think some had the impression that EA-style AI safety concerns had already become popular.
Politico: 'The [AI] order... [will] create a raft of new government offices and task forces and pave the way for the use of more AI in nearly every facet of life touched by the federal government, from health care to education, trade to housing and more' politico.com/news/2023/10/27…
13
6
114
40,855
My own basic calculations suggest that, given the potential for increased investment and hardware progress, we could very soon move through a large fraction of the remaining compute gap between the current frontier models and the literal amount of computation used by evolution.
6
12
125
10,919
This is not true. Yudkowsky wrote that you could create AGI by scaling compute. He just thought you could do it more easily (and more safely) if you understood exactly how intelligence works. lesswrong.com/posts/fKofLyep…
reminder that pre-gpt, yud spent years arguing that: 1) neural nets would never produce true ai 2) scaling compute would never produce true ai 3) true ai could only be produced by a genius programmer who imbued the machine with logic and reason, directly from his own mind
5
1
115
18,010
If your probability of AI-doom this century is higher than 90%, can you give me any single concrete prediction about the world before doom that you expect I (or other people) would disagree with?
42
5
112
48,587
I think it's generally better to state what you think is true, and likely to occur, rather than telling a story that you think is "good from a societal perspective". What matters is whether the tame version of the future is accurate, not whether society is ready to hear about it.
some of you need to touch grass lmaoo
4
4
117
11,535
I continue to think that a benign AI takeover is likely both inevitable and desirable. As AIs become more agentic and capable, they will gradually assume more responsibilities, gain legal rights, and earn social influence. There's no need for a dramatic coup, or treacherous turn.
37
5
114
14,506
I think it's a pretty clear mistake to focus on whether AI will "make everyone unemployed". Jobs can always be created by offering workers arbitrarily low wages. What matters is not whether people can find employment, but whether they'll receive a meaningful level of income.
12
9
115
5,752
I feel like a lot of people are assuming that LLM scaling over the next 4 years will resemble LLM scaling over the last 4 years, but that seems unlikely to me. GPT-2 was reportedly trained at a cost of $256/h. It's much easier to scale up fast if that's where you're starting.
9
7
108
33,682
Discontinuous AI progress is probably less safe than continuous, or incremental, progress. That's because continuous progress is more predictable, and better allows us to cope with challenges as they arise, compared to the alternative in which powerful AI suddenly arrives.
1
3
109
7,180