As I wrap up my thesis, I can’t help but look back on the past year of working on Diffusion LLMs.
People often ask me why and how I got into this strange little world of discrete diffusion. I usually give the textbook answer, the kind you’d find in any random paper, and make myself sound like some visionary who foresaw this whole field exploding the way it did. A lie. A blatant lie.
The truth is simpler: the day I first stumbled upon this topic, I felt like a kid again. Remember that toy you loved so much as a child that you couldn’t stop obsessing over? That’s what discrete diffusion felt like to me. My head exploded with possibilities.
⚡️2023 was my golden year.
No. of papers published: 0.
BUT I was working on things that made me genuinely happy: no expectations, no thought of return. That was the year I learned almost everything I know about diffusion. The seeds of MDLM and Duo were planted in late Nov / Dec.
I didn’t start formally working on discrete diffusion until early February ’24, when SEDD came out and showed a lot of promise in this area.
P.S. I’ll miss my desk, which had a gorgeous view of Manhattan.