there are many types of intelligence. which type do we want models to have?
yesterday i was analyzing agent trajectories in our rl envs and digging into tool-calling performance (how models use functions):
> gpt-5 made 10x fewer errors than every other model!
> claude made 10x more errors – but it reflected on its errors and fixed them. and so even though its initial tool calls were much worse, its final performance was close to gpt-5’s.
an outcome-only view would never surface this. in rl, we're taught that only outcomes matter: the reward signal, the final state, the destination.
but that’s why i love digging into individual trajectories to understand what’s going on.
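a rough sketch of the kind of comparison i mean, in python. everything here is hypothetical: the `ToolCall` and `Trajectory` records, the field names, and the definition of "recovery" (a failed call followed later by a successful call to the same tool) are assumptions for illustration, not our actual env schema.

```python
from dataclasses import dataclass


@dataclass
class ToolCall:
    tool: str  # hypothetical tool name
    ok: bool   # whether this call succeeded


@dataclass
class Trajectory:
    calls: list[ToolCall]
    solved: bool  # final outcome of the episode


def error_rate(trajs):
    """fraction of all tool calls that failed."""
    calls = [c for t in trajs for c in t.calls]
    return sum(not c.ok for c in calls) / len(calls) if calls else 0.0


def recovery_rate(trajs):
    """fraction of failed calls followed by a later successful call
    to the same tool in the same trajectory (assumed definition)."""
    failed = recovered = 0
    for t in trajs:
        for i, c in enumerate(t.calls):
            if c.ok:
                continue
            failed += 1
            if any(n.ok and n.tool == c.tool for n in t.calls[i + 1:]):
                recovered += 1
    return recovered / failed if failed else 0.0


def solve_rate(trajs):
    """fraction of trajectories whose final outcome was a success."""
    return sum(t.solved for t in trajs) / len(trajs) if trajs else 0.0


# toy data: one "precise" run and one "adaptive" run
precise = [Trajectory([ToolCall("search", True)], solved=True)]
adaptive = [Trajectory([ToolCall("search", False), ToolCall("search", True)], solved=True)]
```

on this toy data both runs have the same solve rate, but the error rate and recovery rate differ. that difference is the part an outcome-only metric hides.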
gpt-5 embodies precision intelligence: flawless execution, doesn't make the mistake in the first place.
claude embodies adaptive intelligence: it makes errors… but possesses something possibly rarer, the wisdom to notice and correct them.
it’s like that friend who shows up perfectly dressed, says exactly the right thing, never spills their drink vs. the one who trips walking in, knocks over a plant, makes a joke, and everyone laughs.
both are intelligent. which is better?
i don't know. but claude's error-recovery patterns seem closer to metacognition. it's monitoring its execution and thinking about its thinking.
gpt-5 may not need this layer yet. its first-order thinking is so accurate that it doesn’t need to reflect. maybe that's fine for now, while problems are straightforward enough to one-shot.
but what about when they're not?
(note: i don't know if gpt-5 is equally good at error-recovery when it needs to be. maybe it is! but i've noticed claude's recovery capabilities in the past.)
as problems get harder and harder, i wonder if resilience will matter more than perfection.
do you want the model that never falls, or the model that knows how to stand back up?