.@ArtificialAnlys just dropped a brand new leaderboard called AA-Briefcase for evaluating realistic tasks in complex projects.
Nemotron 3 Ultra ranks among the top open models, with strong performance across a wide range of long-running agentic tasks, even when encountering them for the first time.
nvda.ws/4grnX1h
Jun 26, 2026 · 8:19 PM UTC

