After 2 wonderful years, I left Meta this week. During this time, I worked on several projects related to speech and LLMs:
- Built the first multi-channel audio foundation model with M-BEST-RQ (arxiv.org/abs/2409.11494)
- Made ASR with SpeechLLMs faster (arxiv.org/abs/2409.08148) and more accurate (ieeexplore.ieee.org/document…)
- Shipped the first production-ready full-duplex voice assistant (about.fb.com/news/2025/04/in…)
- Improved Moshi’s reasoning capability with chain-of-thought (arxiv.org/abs/2510.07497)
I am grateful to my managers for having my back on critical projects, and fortunate to have collaborated with several brilliant researchers and engineers along the way.
As to what's next, I am still in NYC and continuing to do speech research. More on that later!