Yu Bai · Apr 23, 2026 · 6:31 PM UTC

Yu Bai

Pinned Tweet

Yu Bai

@yubai01

Apr 23

🥔GPT-5.5 is here in Codex and ChatGPT. 🚀 Don’t want to keep saying “step change” with each release, but this time we feel it’s a pretty big one. It may be an inflection point for a lot of things down the road. Please try using this model in Codex for your coding and other professional use cases – Start with the same tasks as before, expect it to do more with less human in the loop, it will make a big difference over 5.4.

OpenAI

@OpenAI

Apr 23

Introducing GPT-5.5 A new class of intelligence for real work and powering agents, built to understand complex goals, use tools, check its work, and carry more tasks through to completion. It marks a new way of getting computer work done. Now available in ChatGPT and Codex.

11,929

Yu Bai · Aug 13, 2025 · 4:26 PM UTC

Yu Bai

@yubai01

13 Aug 2025

We adjusted GPT5 based on all of your valuable feedback -- separated Auto vs. Fast, made 4o available, increased rate limits and made Mini explicit (Mini is an insane model too!)

Sam Altman

@sama

13 Aug 2025

Updates to ChatGPT: You can now choose between “Auto”, “Fast”, and “Thinking” for GPT-5. Most users will want Auto, but the additional control will be useful for some people. Rate limits are now 3,000 messages/week with GPT-5 Thinking, and then extra capacity on GPT-5 Thinking mini after that limit. Context limit for GPT-5 Thinking is 196k tokens. We may have to update rate limits over time depending on usage. 4o is back in the model picker for all paid users by default. If we ever do deprecate it, we will give plenty of notice. Paid users also now have a “Show additional models” toggle in ChatGPT web settings which will add models like o3, 4.1, and GPT-5 Thinking mini. 4.5 is only available to Pro users—it costs a lot of GPUs. We are working on an update to GPT-5’s personality which should feel warmer than the current personality but not as annoying (to most users) as GPT-4o. However, one learning for us from the past few days is we really just need to get to a world with more per-user customization of model personality.

377

57,572

Yu Bai · May 13, 2024 · 6:20 PM UTC

Yu Bai

@yubai01

13 May 2024

📢 Life update: In the vibes of the announcement today I am thrilled to share that I have joined @OpenAI as a researcher! It's just my week 2 here but already amazed by so many things the team has achieved. Looking forward to learning more and contributing!

Liam Fedus

@LiamFedus

13 May 2024

GPT-4o is our new state-of-the-art frontier model. We’ve been testing a version on the LMSys arena as im-also-a-good-gpt2-chatbot 🙂. Here’s how it’s been doing.

346

79,304

Yu Bai · Sep 12, 2024 · 5:33 PM UTC

Yu Bai

@yubai01

12 Sep 2024

We used RL to train a much stronger reasoning model. Excited to have been part of this journey, and way to go!!!

OpenAI

@OpenAI

12 Sep 2024

We're releasing a preview of OpenAI o1—a new series of AI models designed to spend more time thinking before they respond. These models can reason through complex tasks and solve harder problems than previous models in science, coding, and math. openai.com/index/introducing…

284

38,229

Yu Bai · Jun 8, 2023 · 5:54 PM UTC

Yu Bai

@yubai01

8 Jun 2023

Thrilled to share our new work "Transformers as Statisticians👩‍🎓👨‍🎓" Unveiling a new mechanism "In-Context Algorithm Selection" for In-Context Learning (ICL) in LLMs/transformers. ++ A comprehensive theory for transformers to do ICL. arxiv.org/abs/2306.04637 Thread⬇️

219

82,606

Yu Bai · Mar 31, 2021 · 5:39 PM UTC

Yu Bai

@yubai01

31 Mar 2021

🚨 New blog post on Deep Learning Theory Beyond NTKs: Salesforce research blog: blog.einstein.ai/beyond-ntk/ offconvex: offconvex.org/2021/03/25/bey… An exposition of "escaping the NTK ball with stronger learning guarantees". Joint w/ @jasondeanlee @MinshuoC

176

Yu Bai · Aug 5, 2025 · 5:16 PM UTC

Yu Bai

@yubai01

5 Aug 2025

We released our first open-source language model since GPT-2! It was amazing how the entire team has come together in every stage of this work -- squeezing the absolute best performance, stress-testing and mitigating safety risks to a new standard, and overcoming many unforeseen challenges -- the model is finally here and we can't wait to see the amazing research you will all do on top of it!

OpenAI

@OpenAI

5 Aug 2025

Our open models are here. Both of them. openai.com/open-models

153

26,224

Yu Bai · Sep 22, 2023 · 4:20 PM UTC

Yu Bai

@yubai01

22 Sep 2023

Excited to share that "Transformers as Statisticians" will appear at #NeurIPS2023 as an oral! We have two other posters on learning with attention models and RL theory (thread may follow): arxiv.org/abs/2307.11353 arxiv.org/abs/2306.01243

Yu Bai

@yubai01

8 Jun 2023

140

21,417

Yu Bai · Nov 16, 2021 · 6:31 PM UTC

Yu Bai

@yubai01

16 Nov 2021

📜🆕New extended blog post on recent progresses in multi-agent RL theory (joint w/ Chi Jin): yubai.org/blog/marl_theory.h… We talk about "How does RL theory become different when it's multi-agent", and present the various recent developments and opportunities therein.

Recent Progresses in Multi-Agent RL Theory

Reinforcement learning (RL) has made substantial empirical progresses in solving hard AI challenges in the past few years. A big portion of these progresses—Go, Dota 2, Starcraft, economic simulati...

yubai.org

112

Yu Bai · Dec 10, 2023 · 7:07 PM UTC

Yu Bai

@yubai01

10 Dec 2023

Flying to #NeurIPS2023 now. Looking forward to meeting old and new friends, and talking about everything LLM / RL! "Transformers as Statisticians" oral would be in the Wed afternoon session.

10,536

Yu Bai · Aug 6, 2025 · 2:03 AM UTC

Yu Bai

@yubai01

6 Aug 2025

Surreal & beyond excited to have joined forces with @Song__Mei again -- more to come soon!

Song Mei

@Song__Mei

6 Aug 2025

I’m excited to start at OpenAI this May and help ship the oss model. More to come soon!

20,487

Yu Bai · Aug 7, 2025 · 5:42 PM UTC

Yu Bai

@yubai01

7 Aug 2025

Today is the day -- we are excited to bring gpt5 to you. Fortunate to have led several workstreams in GPT5 Thinking and Mini model training. Among many other improvements, with @Song__Mei @minyoung_huh @SebastienBubeck and co, we applied some unexpected but cool techniques to make the model smart, chatty, and a good model all-around. Also honored to have worked together with the crew @yanndubs @ElaineYaLe6 @christinahkim @ericmitchellai @michpokrass @max_a_schwarzer and everyone else, it was fun coming together and doing things! Let us know how you like or dislike it -- this will not be the last model we're gonna train.

OpenAI

@OpenAI

7 Aug 2025

GPT-5 is here. Rolling out to everyone starting today. openai.com/gpt-5/

Introducing GPT-5

16,485

Yu Bai · May 15, 2024 · 4:39 AM UTC

Yu Bai

@yubai01

15 May 2024

And as my journey at Salesforce Research @SFResearch comes to an end after 4.5 years, I can't help but feel so fortunate to have been a part of this amazing AI research team. Thanks @huan__wang @CaimingXiong @silviocinguetta @RichardSocher ++everyone for all the support ♥️

11,915

Yu Bai · Aug 22, 2025 · 3:57 PM UTC

Yu Bai

@yubai01

22 Aug 2025

Brightness = how many cells were reprogrammed by the protein. Left is unaltered; middle is existing protein; right is the new model-designed protein.

Boris Power

@BorisMPower

22 Aug 2025

At @OpenAI, we believe that AI can accelerate science and drug discovery. An exciting example is our work with @RetroBiosciences, where a custom model designed improved variants of the Nobel-prize winning Yamanaka proteins. Today we published a closer look at the breakthrough. ⬇️

7,758

Yu Bai · Jun 25, 2020 · 1:45 AM UTC

Yu Bai

@yubai01

25 Jun 2020

How do deep networks perform hierarchical learning? We theoretically show that networks with wide intermediate representations can express functions hierarchically, and be more sample efficient than "shallow learners" such as the NTK. arxiv.org/abs/2006.13436

Towards Understanding Hierarchical Learning: Benefits of Neural...

Deep neural networks can empirically perform efficient hierarchical learning, in which the layers learn useful representations of the data. However, how they make use of the intermediate...

arxiv.org

Yu Bai · Apr 25, 2022 · 6:03 PM UTC

Yu Bai

@yubai01

25 Apr 2022

#ICLR2022 We present CP-Gen, a modular approach for improving the efficiency (e.g. length, volume) of conformal prediction, by tuning prediction sets with more than one parameter. Paper: openreview.net/forum?id=Ht85… Poster (Monday 6:30pm PT): iclr.cc/virtual/2022/poster/…

Yu Bai · Sep 18, 2023 · 6:08 AM UTC

Yu Bai

@yubai01

18 Sep 2023

We also used ReLU attention to study the expressive power of transformers. It matches softmax in our small (gpt2 scale) experiment in the first paper below. arxiv.org/abs/2306.04637 arxiv.org/abs/2307.11353 Nice work @hoonkp @skornblith and co to get ReLU transformers in action!

Transformers as Statisticians: Provable In-Context Learning with...

Neural sequence models based on the transformer architecture have demonstrated remarkable in-context learning (ICL) abilities, where they can perform new tasks when prompted with training...

arxiv.org

Aran Komatsuzaki

@arankomatsuzaki

18 Sep 2023

Replacing softmax with ReLU in Vision Transformers ReLU-attention has better compute-performance scaling than softmax-attention on Vision Transformers arxiv.org/abs/2309.08586

10,259
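The difference between softmax attention and ReLU attention mentioned in this post can be sketched for a single query in plain Python. This is a minimal illustration, not the setup from either paper: the function names are hypothetical, and dividing the ReLU scores by the sequence length is an assumed normalization that the posts themselves do not spell out.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def combine(weights, values):
    """Weighted sum of value vectors."""
    dim = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]

def scaled_scores(q, keys):
    """Scaled dot-product scores of one query against every key."""
    return [dot(q, k) / math.sqrt(len(q)) for k in keys]

def softmax_attention(q, keys, values):
    """Standard attention: weights are positive and sum to 1."""
    scores = scaled_scores(q, keys)
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return combine([e / total for e in exps], values)

def relu_attention(q, keys, values):
    """ReLU attention: weights are max(score, 0) / n (assumed normalization).

    The weights need not sum to 1, and they can all be zero.
    """
    scores = scaled_scores(q, keys)
    n = len(keys)
    return combine([max(s, 0.0) / n for s in scores], values)
```

One visible consequence: when every score is negative, `relu_attention` returns the zero vector, whereas `softmax_attention` must still distribute a total weight of 1 across the tokens.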

Yu Bai · Oct 26, 2022 · 7:56 PM UTC

Yu Bai

@yubai01

26 Oct 2022

📜 Our paper on efficient learning in Extensive-Form Games will appear at #NeurIPS2022 as an Oral Presentation! 🔗 Paper: arxiv.org/abs/2205.15294 📢 Poster: neurips.cc/virtual/2022/post… Joint work with @chijinML @WispyMay Ziang Song @tianchengyu14 🧵1/

Yu Bai · Dec 13, 2023 · 5:51 AM UTC

Yu Bai

@yubai01

13 Dec 2023

CoT is a nice name! And excited that "Transformers as Statisticians" oral will be in this "CoT/Reasoning" session (Wed)--Being a statistician probably does mean that you're good at reasoning 😃

Denny Zhou

@denny_zhou

13 Dec 2023

In NeurIPS 2023, there is a section “CoT/Reasoning”. When preparing our CoT paper, I kicked off a discussion on the title. Different names were proposed, like stream of thought (Jason), train of thought (Dale), chain of thought (Dale). Finally I decided to choose “chain of thought”. Happy to see the name is liked by the community and popularized. :)

6,852

Yu Bai · Feb 8, 2021 · 9:14 PM UTC

Yu Bai

@yubai01

8 Feb 2021

New preprint on offline RL: arxiv.org/pdf/2102.01748.pdf * A variance reduction algorithm for offline RL * Optimal horizon dependence: O(H^2/d_m) sample complexity on time-homogeneous MDPs Joint w/ Ming Yin (@MingYin_0312) and Yu-Xiang Wang

Yu Bai · Jul 20, 2021 · 6:24 PM UTC

Yu Bai

@yubai01

20 Jul 2021

Check out our #ICML2021 paper---We theoretically analyze calibration, and show that over-confident prediction happens for well-specified logistic regression too, not just on large NNs! Paper: arxiv.org/abs/2102.07856 Poster: icml.cc/virtual/2021/poster/…, Wed 9am PT 1/4

Yu Bai · Aug 4, 2023 · 5:05 PM UTC

Yu Bai

@yubai01

4 Aug 2023

Congrats on the ICML best paper @GoogleDeepMind @misovalko @Tdash_Koz & team! Wraps a "trilogy" on learning Extensive-Form Games: * Our #ICML2022 paper, which first got O(X) (tight): arxiv.org/abs/2202.01752 * Their #NeurIPS2021 paper which got O(X^2): arxiv.org/abs/2106.06279

Near-Optimal Learning of Extensive-Form Games with Imperfect Information

This paper resolves the open question of designing near-optimal algorithms for learning imperfect-information extensive-form games from bandit feedback. We present the first line of algorithms...

arxiv.org

Demis Hassabis

@demishassabis

4 Aug 2023

Congrats to @GoogleDeepMind’s Remi Munos, @misovalko, & team on the Outstanding Paper Award at @ICMLConf! “Adapting to game trees in zero-sum imperfect information games” helps answer: how do you make the best move in a game w/ only partial info? Paper: openreview.net/pdf?id=O1j4uF…

9,519

Yu Bai · Jul 29, 2023 · 9:03 PM UTC

Yu Bai

@yubai01

29 Jul 2023

I'll be presenting "Transformers as Statisticians" at the ES-FOMO workshop at #ICML2023, at 1:00pm HT. See you there! Workshop website: icml.cc/virtual/2023/worksho…

Yu Bai

@yubai01

8 Jun 2023

7,261

Yu Bai · Sep 3, 2019 · 8:41 PM UTC

Yu Bai

@yubai01

3 Sep 2019

Our paper on low switching cost RL (arxiv.org/abs/1905.12849) has been accepted at @NeurIPSConf 2019. We showed that efficient PAC exploration can be achieved by switching the policy only logarithmically many times. Congrats Tengyang, @nanjiang_cs, and Yu-Xiang!

Yu Bai · Oct 18, 2024 · 11:55 PM UTC

Yu Bai

@yubai01

18 Oct 2024

TLDR: Attention sink/massive tokens emerge in LLMs, simply because most heads need to be * Active for some input sequences; * "Dormant" for others. Started as a fun collab during my time @SFResearch, huge shoutout to @TianyuGuo0505 @druv_pai @Song__Mei +co for the amazing work!

Song Mei

@Song__Mei

18 Oct 2024

Many LLMs, e.g., GPT2 and Llama, exhibit a fascinating attention sink phenomenon: attention weights often concentrate on the first token. We studied the training dynamics of toy models to demystify the sink formation mechanisms in LLMs. With fantastic @TianyuGuo0505 , @druv_pai , @yubai01 , @JiantaoJ , and Mike Jordan! ArXiv link: arxiv.org/abs/2410.13835 In detail: Practitioners have consistently found three extreme-token phenomena in LLMs: attention sinks, value-state drains, and residual-state peaks. They often cause trouble in LLM inference and quantization. To understand them, we developed the Bigram-Backcopy task and analyzed a single-layer transformer, revealing two key mechanisms: • Active-dormant mechanism: The attention sink represents the dormant phase of an attention head. • Mutual reinforcement mechanism: Attention sinks and value-state drains mutually reinforce during training. All results can transfer to LLMs! • Llama 2 has a “coding head” that is dormant given Wikipedia texts. • OLMo’s training dynamics closely match the theory and the toy model. We also found that replacing SoftMax attention with ReLU attention can mitigate the extreme-token phenomenon.

2,970

Yu Bai · Nov 28, 2022 · 5:22 AM UTC

Yu Bai

@yubai01

28 Nov 2022

Attending #NeurIPS2022 from Mon Evening -> Sat, and presenting 4 papers (1 oral + 3 posters) on multi-agent RL, games, and deep learning theory. I will also be at Salesforce's booth Tuesday afternoon, starting 2:45pm. Let me know if you want to chat!

Yu Bai · Apr 29, 2020 · 6:24 PM UTC

Yu Bai

@yubai01

29 Apr 2020

The AI Economist: Using multi-agent RL to simulate complex economic systems, guide policy designs, and improve social equality. Impressive work by colleagues @StephanZheng @alexrtrott and all at @SFResearch!

Richard Socher

@RichardSocher

29 Apr 2020

Excited to introduce the AI Economist: Extends ideas from Reinforcement Learning for tackling inequality through learned tax policy design. The framework optimizes productivity and equality. Blog: blog.einstein.ai/the-ai-econ… Paper: arxiv.org/abs/2004.13332 Q&A: salesforce.com/company/news-…

Yu Bai · Oct 4, 2019 · 5:37 PM UTC

Yu Bai

@yubai01

4 Oct 2019

Can wide neural nets be systematically analyzed beyond the kernel / linearized regime? Our recent work shows that wide NNs can couple with higher-order (e.g. quadratic) submodels and generalize better than the linearized ones! arxiv.org/abs/1910.01619 (joint w/ @jasondeanlee)

Yu Bai · Dec 20, 2024 · 6:37 PM UTC

Yu Bai

@yubai01

20 Dec 2024

Besides ~saturating AIME, o3-mini is also the first to consistently solve some of the hard math questions in my own "test set" -- have to update that as well 🤣 Congrats @ren_hongyu @shengjia_zhao @_kevinlu + co!

2,223

Yu Bai · Jul 24, 2023 · 5:59 PM UTC

Yu Bai

@yubai01

24 Jul 2023

En route to #ICML2023 ✈️🌴. Let's chat about LLMs / in-context learning, (multi-agent) RL, and their theories. You can also find me at our posters and workshop papers:

4,170

Yu Bai · Mar 7, 2025 · 11:37 PM UTC

Yu Bai

@yubai01

7 Mar 2025

Congrats @LesterMackey!! still remember all the fun stuff at Stanford Stats 300 class and the stats ML reading group -- influenced me and so many of the next generation of statisticians

COPSS @COPSSNews

7 Mar 2025

🙌🎉Our 2025 recipient of the COPSS Presidents' Award, is Lester Mackey! This award is given annually to a young member of the statistical community in recognition of outstanding contributions to the profession of statistics.

4,439

Yu Bai · Jul 18, 2024 · 6:44 PM UTC

Yu Bai

@yubai01

18 Jul 2024

GPT-4o mini is out!

OpenAI Developers

@OpenAIDevs

18 Jul 2024

Introducing GPT-4o mini! It’s our most intelligent and affordable small model, available today in the API. GPT-4o mini is significantly smarter and cheaper than GPT-3.5 Turbo. openai.com/index/gpt-4o-mini…

4,287

Yu Bai · Apr 16, 2025 · 6:29 PM UTC

Yu Bai

@yubai01

16 Apr 2025

We have a better reasoning model again! I continue to be amazed by how much more the model gains from unlocking more tool use and post-training on a better stack.

OpenAI

@OpenAI

16 Apr 2025

Introducing OpenAI o3 and o4-mini—our smartest and most capable models to date. For the first time, our reasoning models can agentically use and combine every tool within ChatGPT, including web search, Python, image analysis, file interpretation, and image generation.

2,556

Yu Bai · Dec 6, 2019 · 9:29 PM UTC

Yu Bai

@yubai01

6 Dec 2019

Excited to be attending #NeurIPS2019 at Vancouver next week!

Yu Bai · Jun 30, 2023 · 11:18 PM UTC

Yu Bai

@yubai01

30 Jun 2023

Happy to be selected as an expert reviewer for @TmlrOrg ! Time to send in a submission for earning that expert certificate :)

Hugo Larochelle

@hugo_larochelle

30 Jun 2023

We have just finalized our first selection of TMLR Expert Reviewers. These are reviewers who have done particularly exemplary work in evaluating TMLR submissions. See the following page for details and the list of reviewers: openreview.net/group?id=TMLR…

3,200

Yu Bai · Jul 14, 2020 · 6:29 PM UTC

Yu Bai

@yubai01

14 Jul 2020

Excited to present our paper "Provable Self-Play Algorithms for Competitive Reinforcement Learning" at #ICML2020! Talk: Wednesday (July 15) 9am PT / 10pm PT Paper: arxiv.org/abs/2002.04017 Poster: icml.cc/virtual/2020/poster/… Joint work with Chi Jin. 1/2

Yu Bai · Apr 11, 2024 · 4:15 PM UTC

Yu Bai

@yubai01

11 Apr 2024

Check out NPO, a simple objective for LLM unlearning.

Song Mei

@Song__Mei

11 Apr 2024

LLM unlearning was mostly based on variants of gradient ascent (GA), susceptible to catastrophic forgetting. We propose Negative Preference Optimization (NPO), demonstrating efficient unlearning on TOFU benchmark. w/ @RuiqiZhang0614 @ Licong Lin, @yubai01. arxiv.org/abs/2404.05868

5,878

Yu Bai · Apr 8, 2024 · 7:31 PM UTC

Yu Bai

@yubai01

8 Apr 2024

Exciting opportunity for working with Song on LLMs!

Song Mei

@Song__Mei

8 Apr 2024

My group at Berkeley Stats and EECS has a postdoc opening in the theoretical (e.g., scaling laws, watermark) and empirical aspects (e.g., efficiency, safety, alignment) of LLMs or diffusion models. Send me an email with your CV if interested!

4,521

Yu Bai · May 24, 2022 · 5:22 AM UTC

Yu Bai

@yubai01

24 May 2022

Looking forward to this tomorrow! Thanks for organizing @CsabaSzepesvari @neu_rips @CiaraPikeBurke

Csaba Szepesvari @CsabaSzepesvari

24 May 2022

Thinking of scaling up multiagent RL to a large number of agents? Provably? Choose your equilibrium concept right and you may be rewarded! Yu Bai will tell us tomorrow how! For details see tinyurl.com/375x6j6b

Yu Bai · Jul 23, 2023 · 3:19 PM UTC

Yu Bai

@yubai01

23 Jul 2023

Flying to #ICML2023 tomorrow. Ping me if you'd like to chat!

2,944

Yu Bai · Feb 3, 2025 · 3:37 AM UTC

Yu Bai

@yubai01

3 Feb 2025

Lol don't mind at all if deep research gets past me in 2025! @EdwardSun0909 it's on you :)

Quanquan Gu

@QuanquanGu

3 Feb 2025

PhD experts? 🤣🤣 Unless they can perform at @yubai01’s level, they’re irrelevant to the machine learning theory community.

6,160

Yu Bai · Dec 8, 2020 · 6:04 AM UTC

Yu Bai

@yubai01

8 Dec 2020

Appearing at #NeurIPS2020! Come to our poster session at Tuesday 9-11am PT to have some fun with NTKs, shallow Taylorized models, and better sample complexity than all these via neural hierarchical learning. w/ @MinshuoC @jasondeanlee ++ neurips.cc/virtual/2020/prot…

Yu Bai

@yubai01

25 Jun 2020

Yu Bai · Apr 28, 2020 · 6:16 PM UTC

Yu Bai

@yubai01

28 Apr 2020

Come chat with us about our Beyond Linearization paper and more! ICLR poster session today 10am - 12pm and 1 - 3pm PDT: iclr.cc/virtual/poster_rkllG… Paper: openreview.net/forum?id=rkll…

Yu Bai

@yubai01

20 Dec 2019

Our Beyond Linearization paper is accepted at #ICLR2020 ! openreview.net/forum?id=rkll…

Yu Bai · Jun 20, 2022 · 4:52 PM UTC

Yu Bai

@yubai01

20 Jun 2022

In new paper led by @EshaanNichani, we utilize the spectral structure + higher-order "QuadNTK" approximation to show benefit of "After NTK" learning.

Eshaan Nichani @EshaanNichani

20 Jun 2022

What happens “after NTK” in wide neural nets, and how does it improve over the NTK? Excited to announce a new paper with @yubai01 and @jasondeanlee! arxiv.org/abs/2206.03688 A thread on the main takeaways below: (1/9)

Yu Bai · Dec 20, 2019 · 6:43 PM UTC

Yu Bai

@yubai01

20 Dec 2019

Our Beyond Linearization paper is accepted at #ICLR2020 ! openreview.net/forum?id=rkll…

Beyond Linearization: On Quadratic and Higher-Order Approximation...

Wide neural networks can escape the NTK regime and couple with quadratic models, with provably nice optimization landscape and better generalization.

openreview.net

Yu Bai

@yubai01

4 Oct 2019

Yu Bai · Oct 22, 2022 · 5:22 PM UTC

Yu Bai

@yubai01

22 Oct 2022

Check out our new work for efficiently learning "rationalizable equilibria" in multiplayer games---Strategies that are both approximate CE/CCE, and supported on rationalizable actions.

Chi Jin @chijinML

22 Oct 2022

Replying to @chijinML

We are excited to announce our recent work arxiv.org/abs/2210.11402 with @YuanhaoWang3, Dingwen Kong, @yubai01, which presents new algorithms and the first sample-efficient guarantees for learning rationalizable equilibria.

Yu Bai · Mar 31, 2025 · 8:41 PM UTC

Yu Bai

@yubai01

31 Mar 2025

Was always hoping we could do this, and we are finally doing it!

Sam Altman

@sama

31 Mar 2025

TL;DR: we are excited to release a powerful new open-weight language model with reasoning in the coming months, and we want to talk to devs about how to make it maximally useful: openai.com/open-model-feedba… we are excited to make this a very, very good model! __ we are planning to release our first open-weight language model since GPT-2. we’ve been thinking about this for a long time but other priorities took precedence. now it feels important to do. before release, we will evaluate this model according to our preparedness framework, like we would for any other model. and we will do extra work given that we know this model will be modified post-release. we still have some decisions to make, so we are hosting developer events to gather feedback and later play with early prototypes. we’ll start in SF in a couple of weeks followed by sessions in europe and APAC. if you are interested in joining, please sign up at the link above. we’re excited to see what developers build and how large companies and governments use it where they prefer to run a model themselves.

2,320

Yu Bai · Aug 11, 2020 · 12:18 AM UTC

Yu Bai

@yubai01

11 Aug 2020

Our annual AI research grant is now open for applications!

Salesforce AI Research

@SFResearch

10 Aug 2020

Announcing the Third Annual AI Research Grant! For more details and how to apply: Blog: blog.einstein.ai/announcing-… Website: einstein.ai/outreach/grants Good luck to our future applicants!

Yu Bai · Oct 12, 2021 · 5:37 PM UTC

Yu Bai

@yubai01

12 Oct 2021

🆕"When Can We Learn General-Sum Markov Games with a Large Number of Players Sample-Efficiently?" arxiv.org/abs/2110.04184 We theoretically study what RL can learn in multi-player general-sum MGs without exp(# players) samples. Joint w/ Ziang Song (Peking U.) & @WispyMay. 🧵

Yu Bai · Nov 6, 2019 · 12:27 AM UTC

Yu Bai

@yubai01

6 Nov 2019

Will be attending this workshop Thu - Fri. Looking forward to it!

Sanjeev Arora

@prfsanjeevarora

5 Nov 2019

2-day workshop "New Directions in Reinforcement Learning and Control" @the_IAS in Princeton Nov 7-8. Schedule math.ias.edu/ndrlc and livestream here ias.edu/livestream .

Yu Bai · Dec 8, 2020 · 4:52 PM UTC

Yu Bai

@yubai01

8 Dec 2020

#NeurIPS2020 What is the optimal algorithm for multi-agent reinforcement learning in zero-sum Markov games? We present "Near-Optimal Reinforcement Learning via Self-Play" Paper: arxiv.org/abs/2006.12007 Poster session: Tuesday 9-11am PT Joint w/ Chi Jin, Tiancheng Yu.

Yu Bai · Jul 19, 2025 · 3:19 PM UTC

Yu Bai

@yubai01

19 Jul 2025

Congrats @alexwei_ @SherylHsu02 @polynoamial !! This is a crazy result.

Alexander Wei

@alexwei_

19 Jul 2025

Replying to @alexwei_

9/N Still—this underscores how fast AI has advanced in recent years. In 2021, my PhD advisor @JacobSteinhardt had me forecast AI math progress by July 2025. I predicted 30% on the MATH benchmark (and thought everyone else was too optimistic). Instead, we have IMO gold.

2,372

Yu Bai · Apr 15, 2025 · 5:18 AM UTC

Yu Bai

@yubai01

15 Apr 2025

I think "halftime" is such a nice framing -- I had a similar feeling when sitting in Ilya's test-of-time talk at NeurIPS 2024. It felt like major pieces of the puzzle have already come together. Evals--often imaginations of what models can do---lead the way of what we can train them to do.

Shunyu Yao @ShunyuYao12

14 Apr 2025

I finally wrote another blogpost: ysymyth.github.io/The-Second… AI just keeps getting better over time, but NOW is a special moment that i call “the halftime”. Before it, training > eval. After it, eval > training. The reason: RL finally works. Lmk ur feedback so I’ll polish it.

2,423

Yu Bai · Jun 11, 2025 · 5:45 AM UTC

Yu Bai

@yubai01

11 Jun 2025

🚀

Sam Altman

@sama

10 Jun 2025

we are going to take a little more time with our open-weights model, i.e. expect it later this summer but not june. our research team did something unexpected and quite amazing and we think it will be very very worth the wait, but needs a bit longer.

1,831

Yu Bai · Apr 26, 2022 · 8:53 PM UTC

Yu Bai

@yubai01

26 Apr 2022

#ICLR2022 We present provably sample-efficient algorithms for multi-agent RL with large # players, without exp(# players) blowup! Poster session today (Tue 6:30 - 8:30pm PT): iclr.cc/virtual/2022/poster/… Paper👇

Yu Bai

@yubai01

12 Oct 2021

Yu Bai · Dec 3, 2021 · 6:01 PM UTC

Yu Bai

@yubai01

3 Dec 2021

Welcome to check out our new AI Residency Program!

Salesforce AI Research

@SFResearch

2 Dec 2021

Our new AI Residency Program aims to foster the next generation of AI researchers. Our program gives candidates real-world experience and makes them more qualified for top PhD programs. Applications close January 3, 2022: bit.ly/AIResJobAppTwitter

Yu Bai · Aug 6, 2025 · 5:17 PM UTC

Yu Bai

@yubai01

6 Aug 2025

hm what would that be?

OpenAI

@OpenAI

6 Aug 2025

LIVE5TREAM THURSDAY 10AM PT

2,145

Yu Bai · Nov 18, 2024 · 4:25 AM UTC

Yu Bai

@yubai01

18 Nov 2024

Great summer research intern opportunity!

Huan Wang @huan__wang

17 Nov 2024

We're hiring AI Research Interns for Summer 2025! Spend 3 months with us working on AI Agents, LLMs, Reasoning, Planning & more—with a focus on publishing high-quality academic papers. If you have a strong publication record, apply or DM me! #researchpaper #JobOpening #intern

3,538

Yu Bai · Oct 23, 2019 · 11:33 PM UTC

Yu Bai

@yubai01

23 Oct 2019

Check out Song's blog for more nice stuff on statistical physics <-> theoretical ML.

Gabriel Peyré

@gabrielpeyre

23 Oct 2019

Very nice blog post from Song Mei on the replica method from statistical physics. meisong541.github.io/jekyll/…

Yu Bai · Aug 8, 2025 · 6:01 PM UTC

Yu Bai

@yubai01

8 Aug 2025

Congrats @yanndubs !! Back to keeping the capability flywheel flying. 🛞

Yann Dubois

@yanndubs

8 Aug 2025

🔥 So excited to share GPT-5! For thinking mode and API models, we’ve improved performance across key: - Axes: factuality, steerability, long-context performance, efficiency - Domains: coding, writing, healthcare But we still have so many ideas for improvement, and as @SebastienBubeck mentioned, we also discovered a new capability flywheel, where GPT-N can help improve GPT-N+1. I can’t wait to see how GPT-6 will pan out! Now, back to crafting! cc @michpokrass @ericmitchellai @max_a_schwarzer @markchen90 @sama @OpenAI

1,850

Yu Bai · Jun 8, 2023 · 5:54 PM UTC

Yu Bai

@yubai01

8 Jun 2023

Curious exp: A single transformer (TF_alg_select) can simultaneously match Bayes optimal in-context predictions on two tasks (noisy linear models with different noise levels). Those two tasks required different optimal algorithms (ridge regression with different \lambda's)!

1,235

Yu Bai · Jul 20, 2021 · 9:05 PM UTC

Yu Bai

@yubai01

20 Jul 2021

Welcome to check out our work!

Huan Wang @huan__wang

20 Jul 2021

If you are attending ICML @icmlconf this week, you are most welcome to check out some recent work from our team: #ICML2021 @SFResearch

Yu Bai · Dec 30, 2024 · 7:39 PM UTC

Yu Bai

@yubai01

30 Dec 2024

Replying to @_aidan_clark_

Bad local minima were studied a lot, e.g. Auer et al. 1995 "Exponentially many bad local minima for single neurons": papers.nips.cc/paper_files/p… tho the bad example there is clearly contrived, and the authors did not explicitly draw an implication like "NNs are bad" based on it

337

Yu Bai · Feb 28, 2025 · 9:41 PM UTC

Yu Bai

@yubai01

28 Feb 2025

Replying to @EdwardSun0909

Congrats!! When is Deep Research's thesis defense?

2,761

Yu Bai · Aug 6, 2024 · 3:04 AM UTC

Yu Bai

@yubai01

6 Aug 2024

Replying to @johnschulman2

It's been an honor to have been your colleague, and I wish it could have been longer. Thank you and all the best!

965

Yu Bai · Dec 12, 2024 · 9:38 PM UTC

Yu Bai

@yubai01

12 Dec 2024

That's the poster style we all need! 🤣

Simon Zhai @simon_zhai

12 Dec 2024

A huge thanks for everyone who came to the poster session. Posting this to whoever missed the jokes, comments & suggestions are more than welcome.

1,879

Yu Bai · Aug 14, 2025 · 9:01 PM UTC

Yu Bai

@yubai01

14 Aug 2025

Replying to @sughanthans1

It should be much better on coding and a little better across the board!

340

Yu Bai · Jun 14, 2022 · 5:20 AM UTC

Yu Bai

@yubai01

14 Jun 2022

Nice 🧵 by @yuxiangw_cs on low-switching cost RL (aka deployment efficiency). It's a practically relevant setting in between offline RL and "truly" online RL, with many exciting progresses and open questions for both deep RL and RL theory!

Yu-Xiang Wang

@yuxiangw_cs

14 Jun 2022

Online RL guarantees good exploration but has limited applicability (due to safety / legal concerns for online trials-and-errors). Offline RL (aka Batch RL) shows great promise but requires strong assumptions on logged data. Is there anything in between? 1/7

Yu Bai · Mar 31, 2021 · 5:47 PM UTC

Yu Bai

@yubai01

31 Mar 2021

Acknowledgement: Thanks @prfsanjeevarora for hosting offconvex and the many helps! Thanks to other co-authors of our paper @tourzhao @huan__wang @CaimingXiong @RichardSocher

Yu Bai · Jun 12, 2025 · 10:54 PM UTC

Yu Bai

@yubai01

12 Jun 2025

Replying to @jasondeanlee

Congrats! The highest compliment for food -- "edible" by Jason

1,471

Yu Bai · Jun 8, 2023 · 5:54 PM UTC

Yu Bai

@yubai01

8 Jun 2023

To discover and understand the capabilities of LLMs, I believe this combination will become even more powerful. This is joint work with an amazing team of collaborators: Fan Chen (Peking U), @Song__Mei (Berkeley) @huan__wang @CaimingXiong (Salesforce)

866

Yu Bai · Jun 25, 2020 · 1:45 AM UTC

Yu Bai

@yubai01

25 Jun 2020

Joint work with @MinshuoC @jasondeanlee @tourzhao @huan__wang @CaimingXiong @RichardSocher at @SFResearch, Georgia Tech, and Princeton.

Yu Bai · Jun 8, 2023 · 5:54 PM UTC

Yu Bai

@yubai01

8 Jun 2023

Personally, one thing I really like in this project is that **both experiments and ML theory** (statistics, linear algebra with transformers) played crucial roles in isolating the phenomenon and making it rigorous.

818

Yu Bai · Jun 8, 2023 · 5:54 PM UTC

Yu Bai

@yubai01

8 Jun 2023

In the first mechanism, “Post-ICL Validation”, the TF executes many ICL algorithms in parallel on a train split, and outputs the one with lowest loss on a validation split. Example: A TF can do ridge regression with \lambda_1 on input 1 and \lambda_2 on input 2.

806
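
A minimal sketch of the "Post-ICL Validation" selection rule in plain Python, to make the train/validation logic concrete. It is not the transformer construction from the paper: it only mimics the rule of fitting several candidates on a train split and keeping the one with the lowest validation loss. The function names, the scalar-feature setup, and the half/half split are all assumptions made for illustration.

```python
def ridge_fit_1d(xs, ys, lam):
    """Closed-form 1-D ridge regression without intercept: w = <x,y> / (<x,x> + lam)."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


def mse(w, xs, ys):
    """Mean squared error of the predictor x -> w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def post_icl_validation(xs, ys, lams, train_frac=0.5):
    """Fit one ridge predictor per candidate lambda on the train split,
    then return the (lambda, weight) pair with the lowest validation loss."""
    if len(xs) < 2:
        raise ValueError("need at least two examples to form both splits")
    # Keep at least one example on each side of the split.
    n_train = min(len(xs) - 1, max(1, int(len(xs) * train_frac)))
    train_x, train_y = xs[:n_train], ys[:n_train]
    val_x, val_y = xs[n_train:], ys[n_train:]
    candidates = [(lam, ridge_fit_1d(train_x, train_y, lam)) for lam in lams]
    return min(candidates, key=lambda c: mse(c[1], val_x, val_y))


# Noise-free data y = 2x: the unregularized candidate fits exactly and is selected.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.0 * x for x in xs]
best_lam, best_w = post_icl_validation(xs, ys, lams=[0.0, 100.0])
```

On a different in-context dataset (for example, a noisier one) the same call may return a different lambda, which is the sense in which the regularization strength is chosen per input.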

Yu Bai · May 15, 2024 · 4:39 AM UTC

Yu Bai

@yubai01

15 May 2024

Looking forward to seeing more exciting works from the team!

1,522

Yu Bai · Jun 8, 2023 · 5:54 PM UTC

Yu Bai

@yubai01

8 Jun 2023

We call this capability "In-Context Algorithm Selection". This is similar to what a statistician / ML expert can do in real life: Choose the best algorithm for their data at hand. How can a Transformer (TF) do that? We construct two mechanisms in theory.

894

Yu Bai · Jun 16, 2021 · 8:43 PM UTC

Yu Bai

@yubai01

16 Jun 2021

The Accuracy-Calibration frontier is much more informative than just calibration errors alone. Nice to see this extensive empirical study.

Matthias Minderer @MJLM3

16 Jun 2021

New paper: Revisiting the Calibration of Modern Neural Networks (arxiv.org/abs/2106.07998). We studied the calibration of MLP-Mixer, Vision Transformers, BiT, and many others. Non-convolutional models are doing surprisingly well! 1/5

Yu Bai · Jun 8, 2023 · 5:54 PM UTC

Yu Bai

@yubai01

8 Jun 2023

++ Along the way, we develop a comprehensive & quantitative theory for TFs to do ICL:
* Implementing many more ML algs by TF (Lasso, logistic regression, neural networks...)
* New efficient implementation of in-context gradient descent as backbone
* Analysis of pretraining
* ...

650

Yu Bai · Jun 8, 2023 · 5:54 PM UTC

Yu Bai

@yubai01

8 Jun 2023

In the second mechanism, "Pre-ICL testing", the TF runs a certain distribution test to decide which ICL algorithm to use. Example: A TF can do linear regression on a regression problem, and logistic regression on a classification problem, using a binary type check.

736
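
A minimal sketch of the "Pre-ICL testing" idea in plain Python: inspect the labels first, then dispatch to an algorithm. This illustrates only the dispatch logic, not the transformer construction. The helper names, the scalar-feature models without intercept, and the gradient-descent settings are assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def is_binary(ys):
    """Binary type check: True when every label is 0 or 1.
    (Real-valued labels that happen to be all 0/1 would also pass.)"""
    return all(y in (0, 1) for y in ys)


def fit_linear_1d(xs, ys):
    """Least squares without intercept; assumes the xs are not all zero."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def fit_logistic_1d(xs, ys, steps=200, lr=0.5):
    """Logistic regression without intercept, fit by plain gradient descent."""
    w = 0.0
    for _ in range(steps):
        grad = sum((sigmoid(w * x) - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
    return w


def pre_icl_select(xs, ys, x_query):
    """Run the type check first, then predict with the selected algorithm."""
    if is_binary(ys):
        return "logistic", sigmoid(fit_logistic_1d(xs, ys) * x_query)
    return "linear", fit_linear_1d(xs, ys) * x_query


# Real-valued labels dispatch to linear regression, 0/1 labels to logistic regression.
reg_algo, reg_pred = pre_icl_select([1.0, 2.0, 3.0], [3.0, 6.0, 9.0], 4.0)
clf_algo, clf_pred = pre_icl_select([-2.0, -1.0, 1.0, 2.0], [0, 0, 1, 1], 1.5)
```

Unlike the validation-based rule, the choice here is made before any candidate is fit, so only one algorithm is ever run on a given input.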

Yu Bai · Aug 4, 2023 · 5:08 PM UTC

Yu Bai

@yubai01

4 Aug 2023

Replying to @yubai01 @GoogleDeepMind @misovalko @Tdash_Koz

X = number of information sets for a single player, the main measure of game size for EFGs. Their new ICML paper improves over ours in the H (game horizon) dependency, and importantly does not require the structure of the game tree to be known ahead of time.

534

Yu Bai · May 14, 2022 · 5:41 AM UTC

Yu Bai

@yubai01

14 May 2022

Replying to @ml_angelopoulos @davidstutz92

Thanks for flagging! Conditional coverage is definitely an important goal, great to see backproping thru conformal works here. May be interesting to see whether a proper efficiency loss could be designed for conditional coverage (and be optimized) too.

Yu Bai · Sep 17, 2020 · 8:11 AM UTC

Yu Bai

@yubai01

17 Sep 2020

Congrats team!

Caiming Xiong

@CaimingXiong

16 Sep 2020

Our NLP team got 16 papers (11 long, 2 short, and 3 finds) at #emnlp2020, which cover dialogue, summarization, question answering, multilingual, few-shot, NLI, semantic parsing, data augmentation, etc. Congrats to team members and coauthors. More info about papers coming soon!

Yu Bai · Nov 22, 2024 · 8:12 PM UTC

Yu Bai

@yubai01

22 Nov 2024

Replying to @max_simchowitz

Congrats Max!

476

Yu Bai · Feb 13, 2022 · 12:22 AM UTC

Yu Bai

@yubai01

13 Feb 2022

Replying to @Guodzh @SimonShaoleiDu

Yeah the meaning of self-play does depend on the context. In many theory works we do have >=2 *different* agents playing against each other, and we called it "self-play" too, to emphasize we don't need guidance from expert opponents / demonstrations (think AlphaZero vs. AlphaGo)

Yu Bai · Aug 16, 2023 · 9:23 PM UTC

Yu Bai

@yubai01

16 Aug 2023

Replying to @KuanFang @Cornell @CornellCIS @Cornell_CS

Congrats Kuan!!

470

Yu Bai · May 15, 2024 · 5:00 AM UTC

Yu Bai

@yubai01

15 May 2024

Replying to @CaimingXiong @SFResearch @huan__wang @silviocinguetta @RichardSocher

Thanks Caiming, it was great working with you!

444

Yu Bai · Aug 9, 2025 · 4:57 AM UTC

Yu Bai

@yubai01

9 Aug 2025

Replying to @chijinML @Song__Mei

those were really good days!

371

Yu Bai · Oct 26, 2022 · 7:56 PM UTC

Yu Bai

@yubai01

26 Oct 2022

Before we wrap: Our work is hugely inspired by the recent work of @gabrfarina @Chung_Wei_ @HaipengLuo @ChrKroer at #ICML2022: arxiv.org/abs/2202.00237 They used the "kernel trick" to obtain an efficient implementation of Hedge in NFG space, for learning NFCCEs. 14/

Kernelized Multiplicative Weights for 0/1-Polyhedral Games:...

While extensive-form games (EFGs) can be converted into normal-form games (NFGs), doing so comes at the cost of an exponential blowup of the strategy space. So, progress on NFGs and EFGs has...

arxiv.org

Yu Bai · Jun 8, 2023 · 5:54 PM UTC

Yu Bai

@yubai01

8 Jun 2023

These mechanisms not only match our findings in experiments. They also allow TFs to achieve strong ICL performance in theory. Example: We construct a TF to do nearly Bayes-optimal ICL in a challenging task---noisy linear models with **mixed** noise levels.

698

Yu Bai · Aug 7, 2025 · 5:42 PM UTC

Yu Bai

@yubai01

7 Aug 2025

h/t team @_chris_lu_ @SuvanshSanjeev @hadisalmanX and many more!!

839

Yu Bai · Oct 26, 2022 · 7:56 PM UTC

Yu Bai

@yubai01

26 Oct 2022

Our results generalize theirs to the case of EFCEs. Besides, we unveil a new connection in their setting as well: Hedge in NFG space = Kernelized MWU (Farina et al.'s efficient impl.) = Standard OMD with dilated entropy regularizer. Once again, OMD <-> NFG 😎 15/
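
A minimal numerical sketch of the Hedge <-> OMD correspondence in its simplest setting only: a single probability simplex with plain negative entropy as the regularizer. It does not implement the kernelized or dilated-entropy constructions discussed in the thread; the function names and the toy loss sequence are assumptions made for illustration.

```python
import math


def hedge(losses, eta):
    """Hedge / multiplicative weights: p_i proportional to exp(-eta * cumulative loss_i)."""
    n = len(losses[0])
    cum = [0.0] * n
    for loss in losses:
        cum = [c + l for c, l in zip(cum, loss)]
    weights = [math.exp(-eta * c) for c in cum]
    total = sum(weights)
    return [w / total for w in weights]


def omd_neg_entropy(losses, eta):
    """Online mirror descent on the simplex with the negative-entropy regularizer.
    Each step is a multiplicative update followed by renormalization."""
    n = len(losses[0])
    p = [1.0 / n] * n  # uniform start, the minimizer of negative entropy
    for loss in losses:
        p = [pi * math.exp(-eta * l) for pi, l in zip(p, loss)]
        total = sum(p)
        p = [pi / total for pi in p]
    return p


# Toy loss sequence over three actions; both procedures yield the same iterate.
toy_losses = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.5], [1.0, 1.0, 0.0]]
p_hedge = hedge(toy_losses, eta=0.3)
p_omd = omd_neg_entropy(toy_losses, eta=0.3)
max_gap = max(abs(a - b) for a, b in zip(p_hedge, p_omd))
```

The two iterates agree up to floating-point error because the per-step normalizations in the mirror-descent update telescope into the single normalization used by Hedge.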

Yu Bai · Aug 5, 2025 · 5:16 PM UTC

Yu Bai

@yubai01

5 Aug 2025

& stay tuned for more updates later this week.

1,119

Yu Bai · May 31, 2024 · 12:40 AM UTC

Yu Bai

@yubai01

31 May 2024

Replying to @nanjiang_cs

Congrats Nan!

675

Yu Bai · Feb 19, 2025 · 5:46 PM UTC

Yu Bai

@yubai01

19 Feb 2025

Replying to @WenSun1

Congrats!

220

Yu Bai · Feb 3, 2025 · 3:36 AM UTC

Yu Bai

@yubai01

3 Feb 2025

Replying to @EdwardSun0909

Congrats!

518

Yu Bai · Jul 17, 2023 · 7:30 PM UTC

Yu Bai

@yubai01

17 Jul 2023

Replying to @aviral_kumar2 @SCSatCMU @CSDatCMU @mldcmu @svlevine

Huge congrats @aviral_kumar2 !!

712

Yu Bai · Dec 10, 2023 · 9:19 PM UTC

Yu Bai

@yubai01

10 Dec 2023

Replying to @LoVVgE

Running into you at both the local street and NOLA! 🤣

149

Yu Bai · Jul 13, 2023 · 1:07 AM UTC

Yu Bai

@yubai01

13 Jul 2023

Replying to @ben_eysenbach @Princeton @mldcmu @rsalakhu @svlevine @PrincetonCS

Congrats Ben!!

475

Yu Bai · Oct 26, 2022 · 7:56 PM UTC

Yu Bai

@yubai01

26 Oct 2022

What's even nicer about the OMD connection: We build on it to design a modified OMD algorithm that achieves an improved, and the first near-optimal, sample complexity for learning EFCE under bandit feedback. That is our second main result. 12/

Yu Bai · Nov 9, 2024 · 12:28 AM UTC

Yu Bai

@yubai01

9 Nov 2024

Replying to @lilianweng

We will miss you! Good luck on your new journey 🩵

882

Yu Bai · Apr 22, 2023 · 9:25 PM UTC

Yu Bai

@yubai01

22 Apr 2023

Replying to @martinjzhang @CMUCompBio

Congrats!

451