🆕
@latentspacepod: Is finetuning GPT4o worth it? w/ @AlistairPullen of @cosine_sh
Betteridge's law says no: with 59 different flavors of RAG, and >2 million token context + prompt caching, it's reasonable to believe that "in-context learning is all you need".
But Genie is the first to make a huge bet on finetuning @OpenAI GPT4o for code at the largest scale it has ever been used externally, resulting in what is now the #1 coding agent in the world according to SWE-Bench Full (30%), Lite (50%), and Verified (40%), by a country mile.
Most finetuning is in the <100M token range. It's no surprise that the results aren't that game-changing.
We delve into the process of wandering the idea maze with YC, working with @john__allard and co, and creating billions of tokens of synthetic code data from real user logs and purposefully sabotaging ASTs to create reasoning traces that exhibit:
- Perfect info lineage
- Incremental knowledge discovery
- Step-by-step decision making
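For a flavor of what "sabotaging ASTs" could look like, here's a toy Python sketch. It is an illustration under loud assumptions, not Genie's actual pipeline: the `FlipFirstComparison` mutator, the `sabotage` helper, and the `broken`/`fixed` pair format are all hypothetical names made up for this example. The idea is to take working code, inject exactly one bug at the AST level, and keep the (broken, fixed) pair as raw material for a repair trace.

```python
import ast


class FlipFirstComparison(ast.NodeTransformer):
    """Hypothetical mutator: turn the first simple `<` comparison into `>=`."""

    def __init__(self):
        self.changed = False

    def visit_Compare(self, node):
        # Visit children first so nested comparisons are considered too.
        self.generic_visit(node)
        if (
            not self.changed
            and len(node.ops) == 1
            and isinstance(node.ops[0], ast.Lt)
        ):
            node.ops = [ast.GtE()]  # inject exactly one bug
            self.changed = True
        return node


def sabotage(source):
    """Return a {'broken', 'fixed'} pair, or None if nothing could be mutated.

    Assumption: the dict layout is illustrative, not a known training format.
    Requires Python 3.9+ for ast.unparse.
    """
    tree = ast.parse(source)
    mutator = FlipFirstComparison()
    broken_tree = mutator.visit(tree)
    if not mutator.changed:
        return None
    ast.fix_missing_locations(broken_tree)
    return {"broken": ast.unparse(broken_tree), "fixed": source}


ORIGINAL = (
    "def clamp(x, lo):\n"
    "    if x < lo:\n"
    "        return lo\n"
    "    return x\n"
)

pair = sabotage(ORIGINAL)
print(pair["broken"])
```

Because the bug's origin is known by construction, a pair like this gives a clean starting point for a trace that walks from the broken code back to the fix.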
Enjoy! Full pod link below.