Prime Intellect · May 7, 2026 · 3:12 AM UTC

Prime Intellect

Pinned Tweet

Prime Intellect

@PrimeIntellect

May 7

The next wave of AI will not be won by better prompts. It will be won by systems that learn from experience. Today, Prime Intellect Lab is out of beta, open for you to start training your own models. The era of self-improving agents is here.

204

2,007

1,345,726

will brown · Jun 26, 2026 · 12:09 AM UTC

Prime Intellect retweeted

will brown

@willccbb

Jun 26

something has definitely shifted in the past few weeks. seeing a huge uptick in large enterprises wanting to secure compute and post-train their own models in house, frequently on top of GLM-5.2. everyone is starting to understand how open source wins.

193

2,604

254,621

Vibrant Labs · Jun 23, 2026 · 4:15 PM UTC

Prime Intellect retweeted

Vibrant Labs

@VibrantLabsAI

Jun 23

1/n For browser agents, a major bottleneck in evaluation is truthful scoring on the live web. A task is only as good as your ability to confirm the agent actually did it, on a real site whose state keeps moving and that the agent can potentially misreport. So we took matters into our own hands. Today, we're releasing Ecom Bench on @PrimeIntellect: 40 shopping tasks on real Shopify storefronts, each run in a live @browserbase browser and graded by a deterministic verifier. vibrantlabs.com/research/eco…

Image: Cost vs Accuracy (DOM and CUA)

6,828

Mika Senghaas · Jun 23, 2026 · 3:23 AM UTC

Prime Intellect retweeted

Mika Senghaas

@mikasenghaas

Jun 23

this is a good one

Prime Intellect

@PrimeIntellect

Jun 23

Today we're releasing prime-rl v0.6.0 — enabling RL at trillion-parameter MoE scale on agentic workloads at the highest efficiency. We've relentlessly optimized our RL infra. The result: GLM-5 on agentic SWE tasks at 131k context and sub-5-minute step time.

6,174

Prime Intellect · Jun 23, 2026 · 2:16 AM UTC

Prime Intellect

@PrimeIntellect

Jun 23

Huge thanks to the @vllm_project team, and @robertshaw21 in particular, for all the help along the way. Also to the llm-d and Dynamo teams for the collaboration on routing and inference.

3,910

Prime Intellect · Jun 23, 2026 · 2:16 AM UTC

Prime Intellect

@PrimeIntellect

Jun 23

prime-rl is fully open source, and we're hiring systems engineers to take it further. Read the full prime-rl performance deep dive: primeintellect.ai/blog/rl-at…

RL at 1T Scale: prime-rl Performance Deep Dive

prime-rl 0.6.0 trains trillion-parameter MoE models on heavy agentic workloads at the highest efficiency. A deep dive into the inference and training optimizations behind it — from FP8 and wide...

primeintellect.ai

105

18,410

Prime Intellect · Jun 23, 2026 · 2:16 AM UTC

Prime Intellect

@PrimeIntellect

Jun 23

Over a long run the trainer and inference policies slowly drift apart, and that mismatch can kill your training. R3 (router replay) captures the routing decisions from the inference engine, replays them on the trainer - KL mismatch drops ~10x.

3,727
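
A minimal sketch of the router-replay idea in the post above, not prime-rl's actual implementation. It assumes a top-k MoE router, and that the trainer reuses the expert indices recorded by the inference engine while still computing gate weights from its own logits. The function names and the example logits are hypothetical.

```python
import math

def top_k_experts(router_logits, k):
    """Indices of the k largest router logits, highest first."""
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    return order[:k]

def gate_weights(router_logits, expert_ids):
    """Softmax restricted to the logits of the selected experts."""
    peak = max(router_logits[i] for i in expert_ids)
    exps = [math.exp(router_logits[i] - peak) for i in expert_ids]
    total = sum(exps)
    return [e / total for e in exps]

def route(router_logits, k, replay_ids=None):
    """Pick experts from the logits, or replay recorded indices if given."""
    ids = replay_ids if replay_ids is not None else top_k_experts(router_logits, k)
    return ids, gate_weights(router_logits, ids)

# Hypothetical logits for one token: the two engines disagree only by
# a tiny numerical difference on experts 1 and 2.
inference_logits = [2.00, 1.51, 1.50, -1.0]
trainer_logits = [2.00, 1.50, 1.51, -1.0]

inf_ids, _ = route(inference_logits, k=2)        # recorded during rollout
free_ids, _ = route(trainer_logits, k=2)         # trainer routing on its own
replayed_ids, replayed_w = route(trainer_logits, k=2, replay_ids=inf_ids)
```

Without replay, the small logit difference sends the token to a different expert on the trainer (`free_ids` differs from `inf_ids`). With replay, both sides evaluate the same experts.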

Prime Intellect · Jun 23, 2026 · 2:16 AM UTC

Prime Intellect

@PrimeIntellect

Jun 23

The trainer is 3D-parallel (FSDP2 + CP + EP), built on TorchTitan. FSDP2 shards params, grads & optimizer state. EP keeps experts sharded and routes tokens with all2all instead of all-gathering ~80GB per layer. CP handles the 131k context and GLM-5's DSA attention.

3,534
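
To illustrate the expert-parallel routing mentioned above, here is a small sketch of the bookkeeping that precedes an all-to-all exchange: each token is sent to the rank that owns its expert, so expert weights never need to be gathered. The contiguous expert placement and the function names are assumptions for illustration, and no real communication is performed.

```python
def expert_owner(expert_id, num_experts, ep_size):
    """Rank that holds a given expert, assuming contiguous equal-sized shards."""
    if num_experts % ep_size != 0:
        raise ValueError("num_experts must be divisible by ep_size")
    experts_per_rank = num_experts // ep_size
    return expert_id // experts_per_rank

def plan_dispatch(token_expert_ids, num_experts, ep_size):
    """Group token positions by destination rank (the all-to-all send lists)."""
    buckets = [[] for _ in range(ep_size)]
    for token_pos, expert_id in enumerate(token_expert_ids):
        buckets[expert_owner(expert_id, num_experts, ep_size)].append(token_pos)
    return buckets

# Six tokens, each routed to one of 8 experts sharded over 4 ranks.
send_buckets = plan_dispatch([0, 7, 3, 2, 5, 1], num_experts=8, ep_size=4)
```

Only the token activations in each bucket move between ranks, which is the traffic an all-to-all carries in place of gathering full expert weights.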

Prime Intellect · Jun 23, 2026 · 2:15 AM UTC

Prime Intellect

@PrimeIntellect

Jun 23

949

288,296

Prime Intellect · Jun 23, 2026 · 2:16 AM UTC

Prime Intellect

@PrimeIntellect

Jun 23

One Mooncake store pools KV cache across all nodes, so any worker can reuse any prefix. The router picks workers by a score over load, queue depth, KV usage and prefix overlap. You get cross-replica cache hits with balanced routing across the whole deployment.

3,868
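
A rough sketch of a routing score over the four signals named in the post above (load, queue depth, KV usage, prefix overlap). The weights, the `Worker` fields, and the linear form of the score are all assumptions. The post does not say how the real router combines them, and with a pooled store the prefix term would reflect reuse cost more than a strict per-worker cache.

```python
from dataclasses import dataclass

@dataclass
class Worker:
    name: str
    load: float           # fraction of compute busy, 0..1
    queue_depth: int      # requests waiting
    kv_usage: float       # fraction of KV memory in use, 0..1
    cached_prefix: tuple  # token ids this worker can reuse cheaply

def prefix_overlap(prompt, cached):
    """Fraction of the prompt covered by the longest shared prefix."""
    shared = 0
    for a, b in zip(prompt, cached):
        if a != b:
            break
        shared += 1
    return shared / len(prompt) if prompt else 0.0

def score(worker, prompt, w_prefix=2.0, w_load=1.0, w_queue=0.1, w_kv=0.5):
    """Higher is better: reward cache reuse, penalize busy workers."""
    return (w_prefix * prefix_overlap(prompt, worker.cached_prefix)
            - w_load * worker.load
            - w_queue * worker.queue_depth
            - w_kv * worker.kv_usage)

def pick_worker(workers, prompt):
    return max(workers, key=lambda w: score(w, prompt))

workers = [
    Worker("a", load=0.9, queue_depth=8, kv_usage=0.8, cached_prefix=(1, 2, 3, 4)),
    Worker("b", load=0.2, queue_depth=1, kv_usage=0.3, cached_prefix=(1, 2)),
    Worker("c", load=0.1, queue_depth=0, kv_usage=0.1, cached_prefix=()),
]
chosen = pick_worker(workers, prompt=(1, 2, 3, 4, 5, 6, 7, 8))
```

Here worker `a` has the best prefix match but is overloaded, so the score favors `b`, which trades a smaller cache hit for a shorter queue.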

Prime Intellect · Jun 23, 2026 · 2:16 AM UTC

Prime Intellect

@PrimeIntellect

Jun 23

We disaggregate prefill and decode onto separate workers. A long prefill used to stall decode for everyone. Now it doesn't.

4,679

Prime Intellect · Jun 23, 2026 · 2:15 AM UTC

Prime Intellect

@PrimeIntellect

Jun 23

In RL, inference is the bottleneck — we optimize for throughput, not latency. High concurrency, FP8 precision, and wide expert parallelism over 32+ GPUs. Every GPU holds its own slice of experts and acts as its own endpoint.

6,311

Johannes Hagemann · Jun 16, 2026 · 6:44 PM UTC

Prime Intellect retweeted

Johannes Hagemann

@johannes_hage

Jun 16

awesome post by @kimbochen covering RL systems end-to-end, including a SWE training run on GLM-5 using our prime-rl framework.

SemiAnalysis

@SemiAnalysis_

Jun 16

RL Systems Mind the Gap: Matching Trainer and Generator Throughput

RL Training Infrastructure, GRPO, PipelineRL, Async RL, Policy Staleness, RL Sandbox Infra, CPU Requirements, TCO Analysis, Thinking Machines Tinker

newsletter.semianalysis.com/…

145

20,898

elie · Jun 16, 2026 · 7:20 PM UTC

Prime Intellect retweeted

elie

@eliebakouch

Jun 16

nice blog by @kimbochen about the current RL ecosystem, goes into detail about the different settings and tradeoffs to consider when RLing open models

SemiAnalysis

@SemiAnalysis_

Jun 16

13,121

Vincent Weisser · Jun 17, 2026 · 11:32 PM UTC

Prime Intellect retweeted

Vincent Weisser

@vincentweisser

Jun 17

Excited to support this epic Inference-time compute hackathon with Prime Intellect credits for post-training + compute

24hr hack on
> Agents: Multi-step systems that take a goal and execute. Tool use, planning, long horizons.
> Real-Time and Interactive: Sub-second loops, live multimodal.
> RL + Applied AI: Systems that judge capability of a person or a model. Autograders, preference ranking, rubrics, verified-skill signals, human-in-the-loop.

luma.com/hncudfxb

Inference-Time Compute Hackathon · Luma

$50k top prize. $100k+ total. 24 hours to build. Anthropic, Etched, Cognition, and Mercor are hosting a 24-hour hackathon with compute support from Prime…

luma.com

6,887

Vincent Weisser · Jun 17, 2026 · 1:18 AM UTC

Prime Intellect retweeted

Vincent Weisser

@vincentweisser

Jun 17

We are so back! Future looking bright to post-train, serve, and continuously improve your own model on top of models like GLM-5.2 using primeintellect.ai/ 🫡

Prime Intellect - The Open Stack for Self-Improving Agents

The compute and infrastructure platform for you to train, evaluate, and deploy your own agentic models.

primeintellect.ai

Z.ai

@Zai_org

Jun 16

Introducing GLM-5.2: Frontier Intelligence, Open Weights

- Significant improvements in coding and agentic tasks
- Strong long-horizon capabilities with a 1M context window
- Two levels of reasoning effort: GLM-5.2 (max) pushes the limits, while GLM-5.2 (high) strikes a strong balance between performance and token efficiency
- MIT-licensed open weights
- Same API pricing as GLM-5.1

Tech Blog: z.ai/blog/glm-5.2
Weights: huggingface.co/zai-org/GLM-5…
API: docs.z.ai/guides/llm/glm-5.2
Coding Plan: z.ai/subscribe
Chat: chat.z.ai

118

9,096

Vincent Weisser · Jun 17, 2026 · 12:20 AM UTC

Prime Intellect retweeted

Vincent Weisser

@vincentweisser

Jun 17

Great RL systems deep dive by @SemiAnalysis_

Scaling RL is as much of an infra problem as an algorithm one

SemiAnalysis ran experiments on our stack: Prime RL + Sandboxes.

System efficiency is ultimately queue health to match generator and trainer throughput

SemiAnalysis

@SemiAnalysis_

Jun 16

119

28,769

will brown · Jun 14, 2026 · 10:50 PM UTC

Prime Intellect retweeted

will brown

@willccbb

Jun 14

been beating this drum since early 2025, seems like people are starting to see why it's so important :) RL works -> "train or get trained on" -> open models + post-training infra are the path to institutional flywheels + democratization of AI progress

MidCurveCapital

@midwit_capital

Jun 14

The next big trade is infrastructure / RL environments that enable companies to turn their institutional knowledge / processes into continuously improving learning loops that they can own.

392

35,696

Vincent Weisser · Jun 14, 2026 · 9:38 PM UTC

Prime Intellect retweeted

Vincent Weisser

@vincentweisser

Jun 14

Satya is perfectly describing the why and what behind @primeintellect since 2023 🫡

> AI needs to be open & sovereign
> Let every company create its own self-improving agents: and own their loop to make them better
> A rich open ai ecosystem creates far more abundance than a future locked down by a few closed labs
> Every company is becoming an ai company: so every company needs to own its own product <> model improvement loop

@primeintellect enables this today:
> Your own evals + rl envs for the outcomes you care about
> models self-improving in production from your real traces
> don't cede your moat to a handful of labs. This self-improvement loop is the IP and it compounds

Open self improving agents for everyone 🫡

Satya Nadella

@satyanadella

Jun 14

x.com/i/article/206558289479…

A frontier without an ecosystem is not stable

I’ve been thinking a lot about the future of the firm in an AI-driven economy. This transition is different than any previous platform shift. In the past, we used digital systems to enhance human

228

25,054

Prime Intellect · Jun 10, 2026 · 9:14 PM UTC

Prime Intellect

@PrimeIntellect

Jun 10

By performing SFT on tool outputs and RL on the assistant tokens, we can efficiently teach the model the environment dynamics. This happens on-policy: the LLM models the environment not in a vacuum but in response to its own actions.

9,942
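
One way to picture the split described above is a per-token role mask: assistant tokens get a policy-gradient term, tool-output tokens get a plain next-token (SFT) term. This is a sketch under assumptions, not the loss from the blog post. The function name, the REINFORCE-style RL term, the per-role averaging, and `sft_weight` are all hypothetical.

```python
def masked_rl_plus_sft_loss(logprobs, roles, advantage, sft_weight=1.0):
    """Combine an RL term on assistant tokens with an SFT term on tool tokens.

    logprobs:  log-probability the model assigns to each observed token
    roles:     "assistant" or "tool" for each token
    advantage: scalar advantage of the whole rollout
    """
    rl_terms = [-advantage * lp for lp, r in zip(logprobs, roles) if r == "assistant"]
    sft_terms = [-lp for lp, r in zip(logprobs, roles) if r == "tool"]
    rl_loss = sum(rl_terms) / len(rl_terms) if rl_terms else 0.0
    sft_loss = sum(sft_terms) / len(sft_terms) if sft_terms else 0.0
    return rl_loss + sft_weight * sft_loss

# Two assistant tokens (the action) followed by two tool-output tokens
# (the environment's response to that action).
example_loss = masked_rl_plus_sft_loss(
    logprobs=[-0.5, -1.0, -2.0, -0.25],
    roles=["assistant", "assistant", "tool", "tool"],
    advantage=2.0,
)
```

Because the tool tokens come from rollouts the model itself produced, the SFT term is trained on responses to the model's own actions, which is the on-policy property the post points to.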

Prime Intellect · Jun 10, 2026 · 9:14 PM UTC

Prime Intellect

@PrimeIntellect

Jun 10

We show strong results in the under-resourced programming language Forth and evaluate generalization to unrelated environments. We also characterize what aspects of an environment lead to overfitting when using ECHO, how model behavior is impacted, and much more.

5,096

Prime Intellect · Jun 10, 2026 · 9:14 PM UTC

Prime Intellect

@PrimeIntellect

Jun 10

Read more: primeintellect.ai/blog/true-…

True Agents Model the World

World modeling during RL trains models to predict their environment in response to their own actions, improving in-domain generalization and efficiency in early ECHO experiments.

primeintellect.ai

4,817