In AI research there is tremendous value in intuitions on what makes things work. In fact, this skill is what makes “yolo runs” successful, and can accelerate your team tremendously.
However, there’s usually no track record of how good someone’s intuition is. A fun way to build one is “betting”, where researchers try to predict the results of an experiment, or whether an approach will ultimately be successful.
When I was at Google Brain in 2022, I made a bet on what accuracy a 540B-parameter LLM would get on mate-in-one in chess after finetuning. I had great fun asking my friends to participate; their predictions ranged from 10% to 80% (I think it ended up being around 30%).
I particularly enjoyed a few bets with @LiamFedus (now my manager at OpenAI). Back in the day when we were writing a paper on emergent abilities, we bet on whether he would be able to predict the final accuracy of a task based on the log-prob trends from smaller models, and I won that one. More recently, we had a bet on how much data would be needed for a model to reach a certain performance, and I lost that bet by an order of magnitude. It was a nice ego check for me. (Bro tip: if you bet a dinner, specify the price range before you lose.)
Having a track record holds you accountable for your intuitions and helps you remember when you were wrong. The best researchers excite their peers about only a few things, and some of those things work well in a big way. You don’t want to be excited about everything, only for a small portion of those things to actually work. Finally, I think there is also a lot of value in correctly predicting that a research direction won't go well. These “negative bets” aren’t typically rewarded in today’s culture, but I believe there is a lot of value in saving your team time.