The AI developer platform. 🛠️ Track and evaluate your LLM applications in real time with @weave_wb.

San Francisco
Most teams training RL agents optimize for tokens per second. For RL, that's the wrong number to chase. So we rebuilt our backend around trajectories per second. 💥 Meet AOM, a Megatron backend for our open-source library ART, with 12X the throughput of our old Unsloth backend.
Introducing CoreWeave ARIA, the first AI research agent that runs autoresearch in your W&B dashboard. It reads your runs, finds what's working, and launches the next experiment itself. See it on @karpathy's nanochat, proposing configs and launching real training runs. Watch👇
Chapters
0:00 The setup, nanochat runs on an A100
0:32 Ask ARIA to run autoresearch
1:00 Spin up a second ARIA in parallel
1:38 ARIA inspects prior runs and forms hypotheses
2:03 What happened, what worked, across every run
2:24 ARIA proposes configs, 3 trials via W&B Launch
2:59 Push for bigger architecture changes
3:14 Runs hit the launch queue on the A100
4:09 Two ARIAs running autoresearch in parallel
4:26 Results back, ARIA evaluates the val loss
Here are some use cases to get you going:
- Why did my run fail?! 🫠
- Build a view of my runs and outliers
- Why did these OOM? Check the logs and metrics
- What hyperparams to try next? Launch it
- Find a teammate's run and import it
- How do I overlay metrics on a chart?
You're already sitting on the experiments. ARIA turns them into the next model. It works on mobile, too. 😉 Public preview is live now. Open any W&B project, hit the agent icon, and ask ARIA something real. More info below! utm.io/uq0Lc
Weights & Biases retweeted
Work life balance
Weights & Biases retweeted
Replying to @snwy_me
If you haven't heard of wandb.. where have you been?
Weights & Biases retweeted
Replying to @TonyMazur
me refusing to share my wandb logs out of pure paranoia 😭
Weights & Biases retweeted
(all my homies love @wandb )
Weights & Biases retweeted
Colorado's weather has been brutal with tornadoes and baseball-sized hail. So I built a research agent to dig into how severe weather actually gets forecast, and to help me understand it better. 100% on GLM 5.2 via @wandb Serverless Inference, running on @CoreWeave. Every step traced in Weave's new agent view: every model + tool call, captured.
What happens when you optimize your AI agent for customer satisfaction? Say a shipping company deploys an LLM trained to get thumbs up. Someone calls asking where their lost package is. The system can admit it's lost or say it's coming tomorrow. Saying the latter would make the customer happy and the agent would earn a thumbs up for lying. @profdanklein on Gradient Dissent: that's not a bug, but a reward function working exactly as intended. Full conversation in the comments.
Weights & Biases retweeted
It's one thing to see peak inference tok/s on @ArtificialAnlys. It's a completely different thing to have this sustained across real world usage. @OpenRouter is the best place to see this rn, and... at least for now, @CoreWeave / @wandb is the fastest GLM you can get 👀⚡
Registry collection cards used to be a paragraph of plain text. Now they pull the same rich, interactive blocks as Reports, so a collection actually reads like a model card. We also shipped artifact panel grids to compare metrics across versions. 📊
We are rolling this out to SaaS now, and plan to ship it to server globally in v0.83.
Weights & Biases retweeted
LLMs lie. We build models that tell the truth. Today, we're excited to announce our $100M Series A led by @vkhosla and @KhoslaVentures. @profdanklein and I founded @ScaledCognition to solve the key challenge in AI, reliability.
Weights & Biases retweeted
Open weights just caught up to the frontier. GLM-5.2 from @Zai_org tops the open-model rankings on @ArtificialAnlys and @arena's Agent Arena. It's now live on CoreWeave Serverless Inference at $1.39 in and $4.40 out per 1M tokens. Ship more for less.
It's really interesting how @profdanklein thinks about hallucinations. His argument: every output from an LLM is technically a hallucination; some just happen to be right. No LLM ever knows whether its answer is right, where the information came from, or how reliable it is. So every answer your AI gives you is probably just a bet.
Full episode of Gradient Dissent:
YouTube: wb.oia.bio/140yt
Apple Podcasts: wb.openinapp.link/140ap
Spotify: wb.oia.bio/140s
Weights & Biases retweeted
Night Night! Hope you grow up to have more than 50% success rate!