@Google DeepMind. On leave, Canada CIFAR AI Chair and Former Research Director, @VectorInst. Professor, @UofT (Statistics/CS). Views are my own.

London
Pinned Tweet
Just some personal thoughts now that the AI co-mathematician tech report is public... First, I'm so excited to see the co-mathematician team's hard work out for the world to preview. 💪+🦾=🔥 The team has built a system for mathematicians, with mathematicians. The fact it's now top of the FrontierMath leaderboard is a cherry on top, not the goal. Vibes and utility >> benchmarks. The system is currently being tested with a small number of professional mathematicians. It is not widely available, but I personally hope that, one day, we can get even more capable systems into the hands of all mathematicians. It's been a privilege working with this team at Google DeepMind since January. Props to @dhhzheng, @ADaviesAI, and @pushmeet for their leadership. Give them all a follow to not miss exciting upcoming work.
The future of Math is mathematicians and AI agents working together. Very pleased to introduce @GoogleDeepMind's AI co-mathematician: a multi-agent system designed to actively collaborate with human experts on open-ended research mathematics. Mathematicians testing the agent across areas as diverse as group theory, Hamiltonian systems, and algebraic combinatorics have reported impressive results. In autonomous mode evaluation on the rigorous FrontierMath Tier 4 problems, AI co-mathematician scored an unprecedented 48% — a new high score among all AI systems evaluated.
9
26
244
44,061
Someone's trying to scam my students into buying gift cards for "me". Unfortunately, they're too smart.
102
2,291
16,380
When you finish a PhD in computer science, they take you to a special room and explain that you must never use recursion in real life. Its only purpose is to make programming hard for undergrads. 😂
27
409
5,862
399,235
Dear International Students in the US, Does the US really deserve you?
70
313
4,286
Deep learning.
name one thing this country gave to the world
69
151
2,818
309,126
Got promoted with tenure at University of Toronto :-) Thankful to my brilliant students, collaborators, and mentors. I can't think of a better place to have landed than Toronto. I took risks without existential fear and struck gold over and over with students in Stats and CS.
135
14
1,809
Hey @karpathy. If you are interested in a change of scene and pace, we’re looking for someone to tutor our kids in Python on Thursdays 3:15-4:15pm.
20
26
1,246
179,622
Feeling bad for my student who extended Shtarkov’s characterization of minimax rates to the adversarial setting, a problem open since the early 2000s, and, due to inexperienced reviewers, it got only a poster based on 8,6,6,4 reviews. Should we pull it and send to IEEE info theory?
32
41
1,176
155,856
Maybe I am a physicist?
66
63
1,006
91,191
Too close to home?
Junior researcher: I’m publishing papers at NeurIPS, my students are happy, but my chair says I’m “not impactful enough.” I don’t know what that means.
Senior researcher: What did you tell them you accomplished last year?
Junior: 3 top-tier papers, a new theoretical result on regret bounds, and an invited talk.
Senior: And what did they hear?
Junior: That I published 3 papers?
Senior: They heard “I added to the publication count, but didn’t bring in grants or visibility for the department.”
Junior: But regret bounds are impactful!
Senior: To who?
Junior: To… theorists?
Senior: Your chair spends 20 minutes a month justifying your position to the dean. Can they use regret bounds to argue for funding?
Junior: …probably not.
Senior: What external metrics did your work move?
Junior: One collaboration, one best paper award, and some citations. We don’t really track grant impact.
Senior: There’s the problem. Half your contributions are invisible by design.
Junior: But theory is necessary. The field would break without it.
Senior: I believe you. The dean doesn’t care.
Junior: That seems unfair.
Senior: It is unfair. It’s also how academia works. Chairs get grilled on grants, rankings, and prestige, not the long-run stability of ML theory.
Junior: So what should I do?
Senior: Reframe. “Secured $500K in funding to explore foundational algorithms” sounds better than “proved a tighter regret bound.”
Junior: But I don’t have that funding.
Senior: Then you’re fighting academic reality without weapons.
Junior: I don’t have time to write grants and still publish.
Senior: Most junior faculty don’t. That’s the trap - you get judged on impact but don’t get impact resources.
Junior: So what do I do?
Senior: Acknowledge the game is rigged, then play it anyway.
Junior: Meaning?
Senior: Build collaborations that attract funding. Tie your theory to hot applied areas. Translate your results into language deans understand.
Junior: That feels political.
Senior: Everything above a certain level is political. The choice isn’t political vs pure. It’s visible vs irrelevant.
Junior: What if my chair still doesn’t care?
Senior: Then you’ve learned your chair doesn’t know how to evaluate theory. That’s a different problem - one you solve by finding a better environment.
Junior: This is harder than just proving good theorems.
Senior: Proving good theorems is table stakes. Surviving academia while proving good theorems - that’s the actual job.
Junior PM: I'm shipping everything on time, team loves me, but my manager says I'm "not strategic enough." I'm exhausted trying to figure out what that means.
Senior PM: What did you tell him you accomplished last quarter?
Junior PM: Delivered 5 features, reduced tech debt, improved team velocity by 15%.
Senior PM: And what did he hear?
Junior PM: That I delivered 5 features?
Senior PM: He heard "I kept the team busy with stuff that doesn't move numbers I get asked about."
Junior PM: But velocity improvement is strategic.
Senior PM: To who?
Junior PM: To... the team?
Senior PM: Your manager spends 20 minutes a week with his director explaining why you exist. Can he use velocity to justify your headcount?
Junior PM: I... probably not.
Senior PM: What business metrics did those 5 features move?
Junior PM: Three were tech debt, one was a sales request, one was compliance. We don't really measure impact on that stuff.
Senior PM: There's your problem. Half your work is invisible by design.
Junior PM: But that work was necessary. The platform would break without it.
Senior PM: I believe you. Your manager's director doesn't care.
Junior PM: That seems unfair.
Senior PM: It is unfair. It's also how companies work. Your manager gets grilled about revenue and retention, not platform stability.
Junior PM: So I should have said no to the tech debt?
Senior PM: You probably couldn't. But you should have framed it differently.
Junior PM: How?
Senior PM: "Prevented $200K in potential downtime costs" sounds better than "reduced tech debt."
Junior PM: But I don't have that number.
Senior PM: Then you're fighting organizational reality without weapons.
Junior PM: I don't have analytics support or time to instrument everything.
Senior PM: Most junior PMs don't. That's the trap - you get judged on business impact but don't get business resources.
Junior PM: So what do I do?
Senior PM: Acknowledge the game is rigged, then play it anyway.
Junior PM: Meaning?
Senior PM: Make allies in sales and marketing. They have the numbers you need. Shadow customer calls. Connect your work to their goals.
Junior PM: That feels political.
Senior PM: Everything above a certain level is political. The choice isn't political vs pure. It's visible vs irrelevant.
Junior PM: What if I try this and my manager still doesn't care?
Senior PM: Then you learn your manager doesn't know how to evaluate PM work. That's a different problem - one you solve by finding a better manager.
Junior PM: This is harder than just building good products.
Senior PM: Building good products is table stakes. Surviving organizational dysfunction while building good products - that's the actual job.
21
72
1,095
161,259
I've not mentioned it yet on Twitter, but since my department announced it today, I'll do so too. I was promoted to Full Professor 🎉 (effective July 2024, in fact). Those of you who are professors will know how critical mentors and students are to our success, and this is very much so in my case. I couldn't have asked for a better string of students (and postdocs). They have all found incredible success, and I'm happy to call them colleagues. In fact, one of my postdocs, Thibault Randrianarisoa, has just accepted a tenure track position at UTSC. Congrats, Thibault. In a week, I'm hoping to come back and congratulate yet another student.
106
9
1,065
52,913
Grad Students: Networking is critical for success in academia. But the real question is WHO to network with. I’ve known grad students who will spend time at conferences drinking artisanal kombucha with grad students from programs with no GPUs. Is that 1/
21
41
985
256,171
This is how you generate unbounded amounts of wealth for you and cronies when you can manipulate the stock market at will.
48
49
947
97,826
It's easy to reject papers. You can manufacture issues and sink them based on amorphous/vague notions such as novelty, impact, clarity, etc. Every paper has minor problems that you can amplify. It takes a lot more courage to argue to accept something.
13
96
898
When you finish a PhD in computer science, they take you to a special room and explain that you must never use recursion in real life. Its only purpose is to make programming hard for undergrads. 😂
23
95
830
NeurIPS is now a computer vision conference, it seems. Time to bust it up, because I'm not interested in any of that work, and the top 10 authors are all getting 18+ papers in in areas I don't care about.
29
41
811
321,614
I am inspired by my wife, Dr Gintare Karolina Dziugaite (@KDziugaite) who JUST TODAY defended her thesis at Cambridge, and, not only completed her PhD research in four years, but also brought two wonderful children into this world. Who else could complete a PhD between naps?!
20
19
793
5
71
776
So... now that we are giving online talks that are recorded, maybe we can invite junior scholars rather than listen to senior researchers give the same talk for the 20th time.
13
68
736
Dear PhD students now regretting taking offers at US schools: If you turned down PhD offers in Canada, but want to rethink that, email the professors who were trying to recruit you. They might be able to pull some strings. Your sane neighbor to the north, Canada
21
59
785
89,623
Professional update: I've been named Research Co-Director of the Vector Institute. This is an exciting opportunity but also a big responsibility: Vector has grown tremendously since 2017, with now over 700 researchers. One top priority: attracting next-gen top AI talent. 1/2
👏Exciting news! @roydanroy has been named the Vector Research Co-Director, leading the way in cutting-edge AI research along with current Research Director, Graham Taylor. Learn more about his appointment here: vectorinstitute.ai/dan-roy-n…
81
13
760
112,506
Replying to @AndrewYNg
Defeating misinformation, especially online.
19
11
748
What's the coolest/clearest application of graph neural networks?
54
96
744
Six years and a dozen papers ago, I made a great decision! Happy Anniversary, @KDziugaite !
23
2
703
For many years, I enjoyed working on problems no one else was thinking about. There was no rush to publish and I could slowly make my way towards the correct formalizations. Like hiking in the backcountry. Now I feel like I'm hiking Yosemite Falls Trail in August.
8
52
638
🎉🙏👏🎉🙏👏 Big life announcement. I’m excited to announce that I’ll be joining the effort for XAI starting September 2023. I’m eager to seek the truth and uncover the true nature of the universe. More info to come.
27
6
632
218,507
If you write a paper about X because you read Y, give some love to Y. Don't bury that fact in some bullshit literature review. You might think we're competing but we're not.
5
44
598
I'm an AI researcher.
27
80
750
308,984
A theoretical computer scientist was being seriously considered for a prestigious math prize, when a mathematician on the committee asked whether the work was really mathematics. A very well-known mathematical physicist interrupted, "their theorems are just as useless as yours!"
10
76
588
Who else is getting bored of ML?
47
19
568
Careful mate... that foreigner wants your cookie! #StandUpToRacism
29
159
497
Einstein slept 12 hours a day, played video games for 4 hours a day, regularly procrastinated on Twitter for 3 hours a day, and sat on Zoom for 5 hours a day.
15
18
551
Just got word that a journal article has been accepted. Not usually something I would tweet, but this one is special, because I submitted it on Dec 15, 2011.
9
26
539
135 papers submitted, a record! Congratulations to me. Thanks in advance to all of you who will be reviewing these during your summer. Many are just undergrad ML course projects that I was too embarrassed to kill earlier. Sorry not sorry 🙇🤦‍♂️🪺
20
26
507
I thought the answer was obvious? They've illegally trained off o1, no? Meta risks too much doing that, surely. But this has always been the secret to the success of the "little guys" in this area: steal.
98
14
501
351,090
If you’ve been affected by Trump’s announcement, and you work in ML theory, consider sending a PostDoc application my way.
4
73
497
All hail the signal processing gods who knew all this stuff decades ahead of everyone. This idea of "discovery" is bullshit. Signal processing wasn't in the room when neural nets needed to be adapted to model language. And so attention was invented, again. Imagine I found an even EARLIER example of attention before the 90's example you raise. Does that undermine your claim that it was invented in the 90's by signal processing people? Nope. And same here. @docmilanfar : no idea why you've blocked me, but would it help if I were more sycophantic? 💋🫏
25
20
498
133,856
Dear ICML author, Had a good paper rejected? Reconsider submitting it to NeurIPS. The NeurIPS/ICML/etc review system is more wasteful than bitcoin. What should you do with your work? Submit your work to TMLR. The community is better at judging impact. Regards, Yours Truly
10
29
483
Just had a paper rejected by two workshops. First time that’s happened. ;) I guess we must really be onto something!!
12
14
474
Any optimization experts out there willing to weigh in? arxiv.org/abs/1811.03804
15
109
445
It was his generation in charge when tenure was decided for the current generation, so...
Turing Award winner Michael Stonebraker suggests that the current "diarrhea of papers" is not healthy for science #slowscience – Source: piped.video/DJFKl_5JTnA?t=863
5
8
451
41,169
Saw this beauty in Cambridge, UK. When I looked closer, I thought, only in Cambridge.
7
65
433
Claim: Splitting a paper into TeX files for each section is a mistake, needlessly complicating, among other things, search (and replace). Convince me otherwise. (Unless you use Dropbox for version control, because then I don't care about your opinion. ;)
81
12
441
I'll believe this when all the grad students around me stop using PyTorch.
8
31
435
I'm happy to report that 25 of my 59 submissions to NeurIPS were accepted! ALL YOUR BASE ARE BELONG TO US U LOSERS WHO SUBMITTED << 59 PAPERS!
14
10
408
Switching to working on deep learning is like buying a Porsche.
25
17
405
We REALLY REALLY need a "Findings" for NeurIPS, ICLR, and ICML. 25,000 submissions at this year's NeurIPS represent extreme excess pressure. It takes valuable time away from legitimate new research. One question is how to administer it. I suggest that Findings go through a lightweight round focused on improving clarity, reeling in overselling, etc. An AC can then "sign off". Authors can always decline the opportunity, if they want to try for the next conference. NeurIPS's proceedings are called "Advances in Neural Information Processing Systems". We could have "Findings in Neural Information Processing Systems", or to not trample on their brand, perhaps "Contributions to Neural Information Processing Systems". ICLR could have "Letters on Learning Representations." ICML could have "Machinations on Machine Learning."
23
36
425
81,001
New book on probabilistic programming on arXiv. I’m sure the authors @hyang144 @jwvdm @frankdonaldwood will welcome feedback. arxiv.org/abs/1809.10756
5
142
397
Oh my god. Watch this guy who thinks he's doing something good with AI.
Megvii spokesperson discusses their views on how facial recognition technology will change China
21
87
380
Tell me you don’t understand Monte Carlo variance without telling me you don’t understand Monte Carlo variance.
BREAKING: The Rubik’s cube world record has been broken at 3.13 seconds 🎉🤯
27
34
380
232,045
This is a huge development. I want to highlight the theoreticians behind the scenes, because this paper represents the realization of the impact of years of careful theoretical research. It starts with Greg Yang (@TheGregYang) opening up research on the muP scaling and hyperparameter transfer in infinite-width models. Simultaneously infinite-depth scalings are studied by Boris Hanin (@BorisHanin), Mihai Nica (@MihaiCNica), Mufan Li (@mufan_li), and Soufiane Hayou (@hayou_soufiane), including in networks with residual connections. Then this builds further with the study of infinite-depth scalings and Transformers by Lorenzo Noci (@lorenzo_noci), Blake Bordelon (@blake__bordelon), Mufan, Chuning Li (@ChuningLi), Hamzat Chaudhuri (@hamzatchaudhry), Boris, and Cengiz (@CPehlevan) in at least 3-4 papers, in particular using the DMFT framework. My understanding is that the translation of these insights into this work was highly nontrivial and so congrats to Cerebras for seeing it through with this great team. I also think this work could serve as a wake-up call to those in industry who reacted to muP saying "yeah yeah yeah we ended up at effectively the same place through careful scrutiny". I’d love to know which labs landed here, if any. If not, it goes to show you cannot have everyone grinding code. You need fundamental research to fuel BIG leaps.
(1/7) @cerebras Paper drop: arxiv.org/abs/2505.01618 TLDR: We introduce CompleteP, which offers depth-wise hyperparameter (HP) transfer (Left), FLOP savings when training deep models (Middle), and a larger range of compute-efficient width/depth ratios (Right).  🧵 👇
7
48
393
54,156
I'm launching a new Twitter service. You tweet me the name of a paper you're going to cite and for what reason, and I'll respond with a better paper to cite, if one exists.
33
17
380
No no no no no no no no no. Thankfully, this advice was ignored by the authors. But this widespread but unspoken belief is why NeurIPS/ICML/ICLR reviewing for empirical papers is totally broken.
24
22
343
149,871
Yikes. I'm holding my breath.
The entire dataset of 1.7M+ arXiv papers is now available on @kaggle. We can't wait to see what the machine learning community will do with it! blogs.cornell.edu/arxiv/2020…
11
26
354
Dear @JustinTrudeau @VectorInst @MILAMontreal @CIFAR_News, How can Canada lead in AI if we cannot even process visas for top AI researchers given MONTHS of advance warning? Please RT.
8
107
350
I asked Toronto students how much compute they thought the average NeurIPS author used per paper and 25% of them thought it was over six GPU years. One of them thought it was 800 GPU years. Really not sure what to make of this (The real number is 1 week before the deadline)
12
8
356
My new approach to seminar invites in the time of zoom: I've been co-presenting with students. No extra costs (because zoom) and students get opportunity to talk at fantastic institutions. Two down, hopefully many more like this to go! I recommend sharing the love.
4
29
346
More than 8800 NeurIPS papers. Holy. Cow.
12
67
350
I'm excited to announce that I'm moving to Twitter to take over the Engagement team. I'd like to thank all my collaborators for making this possible. My first step will be to introduce the Edit button.
13
2
341
I will not be accepting AC roles going forward. They are a waste of my time. They are a waste of the field's time. In fact, conferences proceedings should be drop kicked in favor of some new system. Typical review quality is so poor, we are now a cargo cult.
24
49
342
Applying for a faculty position? Turn on google scholar. Don't have a website? Get one. Seriously, people, it's 2017.
14
72
295
ACM: We’d like to congratulate the three fathers of deep learning... Schmidhuber: Hold my bier.
1
30
337
CS majors 10 years ago: 98% for the love. CS majors 10 minutes ago: 98% for the money. Obvious explanation: population shift.
"Learn to code" they told us
16
13
334
48,516
You’re not going to be hearing about ICML from me. The world is too messed up right now. #BlackLivesMatter
1
11
327
Replying to @DimitrisPapail
What if... they're still stochastic parrots? (Equivalently, what if we're just stochastic parrots?)
20
3
336
11,115
Raise your hand if you’re barely keeping your shit together this term. 🖐
24
7
320
I think it's a mistake for the US not to think about accelerating global vaccination. Variants created by out of control spread are going to come back to haunt all of us.
13
15
304
It’s an honor to be named a Canada CIFAR AI Chair and to be part of a rapidly growing ecosystem centered around AI here in Canada. Whether you’re an up and coming AI researcher, or a hard working high schooler, you should set your sights on Canada.
As part of the Pan-Canadian AI Strategy, we are announcing an expansion of the Canada CIFAR AI Chairs program, bringing the total number of chairs to 46, from 29 announced last December. Meet Canada’s AI leaders: cifar.ca/spring-2019-ai-chai… #CIFARAI
29
20
323
Poster session? That’s not a poster session. THIS is a poster session.
4
15
308
The real problem with writing papers with more than 1 idea is that no one reads past the 1st idea and then you have to keep telling people... "no, we did that already, have you read section 5 or appendix J?".
14
20
304
Some thoughts on the recent OpenAI chaos.
32
30
298
134,534
It's time for arXiv to allow anonymous submission (anonymous to arXiv too), but allow the owner to claim ownership at a future date (crypto).
15
95
306
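One way to read the "claim ownership later" idea is as a hash-based commit-reveal scheme: the author publishes a commitment alongside the anonymous paper and later reveals a secret that only they could know. The sketch below is an assumption about how that could work, not anything arXiv offers, and "crypto" in the post may mean something else entirely. The function names `make_commitment` and `verify_claim` are hypothetical.

```python
import hashlib
import hmac
import secrets


def _commit(nonce: bytes, author: str, paper_bytes: bytes) -> str:
    # The nonce and paper digest are both fixed-length (32 bytes each),
    # so the concatenation below cannot be parsed in two different ways.
    paper_digest = hashlib.sha256(paper_bytes).digest()
    return hashlib.sha256(nonce + paper_digest + author.encode("utf-8")).hexdigest()


def make_commitment(author: str, paper_bytes: bytes) -> tuple[str, bytes]:
    """Return (public commitment, secret nonce) for an anonymous submission."""
    nonce = secrets.token_bytes(32)  # kept private until the author claims the paper
    return _commit(nonce, author, paper_bytes), nonce


def verify_claim(commitment: str, author: str, paper_bytes: bytes, nonce: bytes) -> bool:
    """Check a later ownership claim against the commitment published earlier."""
    expected = _commit(nonce, author, paper_bytes)
    return hmac.compare_digest(expected, commitment)


if __name__ == "__main__":
    paper = b"contents of the submitted PDF"
    public_commitment, secret_nonce = make_commitment("A. Author", paper)
    print(verify_claim(public_commitment, "A. Author", paper, secret_nonce))
```

The commitment reveals nothing useful about the author without the nonce, and revealing the nonce later lets anyone recompute the hash and check the claim.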
Yeah, that's pretty much right on.
4
20
294
WILL ML CONFERENCES PLEASE STOP FORCING US TO PUT THE APPENDICES IN ANOTHER FILE.
9
11
300
Replying to @m7amaRamadan
Nice try, scammer!
2
272
Sometimes you only figure out what your paper is REALLY about 3 hours before the deadline.
9
4
280
28,915
I remember reading this paper when it first showed up on arXiv. (Was called l'arXiv back then.) Phenomenal work. I immediately recognized its importance and began working on its application to selling ads.
Oldies but goldies: Joseph Fourier, Théorie analytique de la chaleur, 1822. Introduces sine and cosine series as an approximation method and derives the heat equation PDE. en.wikipedia.org/wiki/Joseph… irphe.fr/~clanet/otherpaperf…
6
20
283
These numbers are so small it’s insulting.
Nobody has fully jailbroken our system yet, so we're upping the ante. We’re now offering $10K to the first person to pass all eight levels, and $20K to the first person to pass all eight levels with a universal jailbreak. Full details: hackerone.com/constitutional…
11
9
281
29,535
Happy Birthday to me! Cake by ⁦@KDziugaite⁩.
20
1
289
It's beginning.
I used ChatGPT to solve an open problem in convex optimization. *Part I* (1/N)
10
10
288
89,653
My wife beat me to receiving a 10 score at ICLR. I didn't even know those existed.
7
2
288
Not sure if this comparison is meant to elevate GPT3 or not.
11
17
268
My ICLR reviewers just responded to my rebuttal and they agree totally and love my revisions 8 8 8 and I won a billions dollars and why yes I would like a unicorn!!
5
7
273
I need a list of reputable venues for ML researchers. Anyone have one? Clearly it should have NeurIPS/ICML/ICLR/CVPR. Where I'm missing familiarity are journals I've never heard of and subfield conferences and journals.
46
22
280
Tell me you've never worked in industry, without telling me you've never worked in industry.
In the long-run, even cuts to STEM funding are very good. Top STEM researchers belong in industry, not academia.
7
13
275
25,755
I’m delighted to announce the delightful news that I can announce my delight and excitement at announcing delight and did I mention delight and wow NeurIPS OOH YEAH D-LIGHT!
7
5
277
Your SOTA code may only be SOTA for some random seeds. Nonsense or new reality? I suppose there are trivial ways to close the gap using restarts and validation data. arxiv.org/abs/2002.06305
21
51
276
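The "restarts and validation data" remark can be sketched as follows: train with several seeds, pick the run with the best validation score, and report that run's test score. This is a toy illustration, not the linked paper's method. `train_and_eval` is a hypothetical stand-in that simulates seed-dependent scores instead of training a model, and all the numbers in it are made up.

```python
import random
from typing import Iterable, NamedTuple


class RunResult(NamedTuple):
    seed: int
    val_acc: float
    test_acc: float


def train_and_eval(seed: int) -> RunResult:
    """Hypothetical stand-in for a real training run; scores depend on the seed."""
    rng = random.Random(seed)
    quality = rng.gauss(0.80, 0.02)             # seed-dependent quality of this run
    val_acc = quality + rng.gauss(0.0, 0.005)   # noisy validation estimate
    test_acc = quality + rng.gauss(0.0, 0.005)  # noisy test estimate
    return RunResult(seed, val_acc, test_acc)


def best_of_restarts(seeds: Iterable[int]) -> RunResult:
    """Select the restart with the highest validation accuracy (never the test score)."""
    runs = [train_and_eval(seed) for seed in seeds]
    return max(runs, key=lambda run: run.val_acc)


if __name__ == "__main__":
    chosen = best_of_restarts(range(10))
    print(f"seed={chosen.seed} val={chosen.val_acc:.3f} test={chosen.test_acc:.3f}")
```

Selecting on validation rather than test accuracy keeps the reported test number honest. The cost is one full training run per seed.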
Just arrived in Addis Ababa, a few days early! Looking forward to #ICLR2020 !!! Anyone else around?
12
7
282
We're way beyond the point that attending a single NeurIPS/ICML/etc is more costly than an Oculus/HoloLens + high speed fiber internet. I'd love to see a VR/AR conference for ML. Imagine the savings in time, pollution, etc.
16
32
269
A new undergrad AI club is asking me for paper recommendations for their journal club. Help me give them a semester's worth of ideas by responding to this message.
44
30
269
115,360
Future Nobel Laureate in Physics.
4
11
257
17,853
Missing real conferences...
7
6
271
Can anyone verify that ChatGPT wasn't trained on the exam it is taking here or in any of the 20 papers on similar topics? And how would you prove this to me?
Capabilities of GPT-4 on Medical Challenge Problems GPT-4, without any specialized prompt crafting, exceeds the passing score on USMLE by over 20 points and outperforms domain-expert models like Med-PaLM. arxiv.org/abs/2303.13375
28
17
256
128,021
Turned down AC chair invite for NeurIPS. With no child care, and no vaccines until June, I need a break.
2
2
261