Deploy chat, feeds, video, and moderation at scale with Stream, an API-driven platform powering over 1B end users. Get started at getstream.io.

Boulder, CO
Today, we’re releasing the first version of our Vision AI framework! It's 100% open-source and has over 10 out-of-the-box plugins for @krispHQ, @ultralytics, @cartesia, @OpenAI, @GoogleAIStudio, and more! With Vision Agents, developers can build fast, low-latency vision AI applications running on Stream from day one with just a few lines of Python ⚒️
"Show, don't tell" is something we're committing hard to here at Stream. We applied this when making our new agent skills pages and we're very happy with the changes we made. What do you think of them?
We were so impressed with the model that our resident ⚽️ expert @stefanjblos built a demo around it for the #FIFAWorldCup. Watch it translate some of the most legendary goal commentaries and do a surprisingly remarkable job at it!
Our latest audio model, Gemini 3.5 Live Translate, takes real-time speech translation to the next level for developers by delivering low-latency translation across 70+ languages. By processing speech as it streams in near real time, the model enables devs to build low-latency audio experiences with:
- Multilingual input: Understands multiple languages in a single session without needing to adjust settings.
- Auto-detection: Identifies the spoken language and begins translation instantly.
- Native audio processing: Generates more natural-sounding speech that preserves speakers' intonation, pacing, and pitch.
- Noise robustness: Filters out ambient noise for clearer conversation in loud environments.
We use the Haiku Benchmark we invented to test the Stream CLI. We get a load of Claude Haiku models to all try to complete the same task with the CLI. If any fail, we improve our CLI code or help messages to make it more usable until they all succeed. Go ahead, roast us 🤷‍♂️
Hola, Bonjour, Hello Gemini 3.5 Live Translate 👋 This new model brings speech-to-speech in 70+ languages, auto-detection, native audio that holds onto tone and pacing, and continuous translation instead of turn-by-turn. Check out the docs below to get started 👇
Gemini 3.5 Live Translate is now available in public preview on the Gemini API and Google AI Studio. 💬 This model translates speech as it streams, giving developers a blazing-fast, low-latency engine to build some seriously cool audio apps. See it in action 👇
Most coding agents don't really know Stream. They guess at our APIs, hallucinate methods, and ship code that doesn't compile. Stream Agent Skills fix that ↓
One command, and you're set:
npx skills add GetStream/agent-skills
Router plus every specialist skill. Markdown only. No binaries, no postinstall scripts.
Congrats to the @MiniMax_AI team on the release of M3! 👉 A frontier-class open-weight model 👉 1M context window 👉 Native multimodality (image & video)
Introducing MiniMax M3: The First Open-Weights Model to Combine Three Frontier Capabilities
- Coding & Agentic Frontier: 59.0% SWE-Bench Pro, 66.0% Terminal Bench 2.1, 34.8% SWE-fficiency, 28.8% KernelBench Hard, 74.2% MCP Atlas
- MiniMax Sparse Attention scales context to 1M
- Natively Multimodal from Step Zero
API: platform.minimax.io
Token Plan: platform.minimax.io/subscrib…
🚀New! MiniMax Code: code.minimax.io
Weights & Tech Report in ~10 Days
Tomorrow, @AWS Summit Amsterdam is finally happening, and we're a part of it! Specifically, our DevRel Lead @d3xvn is going to share insights into building production-ready real-time voice agents with Amazon Bedrock's Nova Sonic 2.0! Link to session in 🧵 Who's joining? ☝️
We’re excited to work with @TencentRTC to bring their low-latency RTC network as a first-class edge provider for Vision Agents ⚡️ Use your favourite LLM/STT/TTS models running directly on TRTC 🤖
We're teaming up with Stream to power the next generation of real-time multimodal AI agents! 🚀 @TencentRTC is now an official transport plugin for @visionagents_ai, offering: 🏆 3,200+ global nodes ⚡ Sub-300ms latency 🌍 Reliable performance in China & Asia Build and scale your multimodal AI apps globally with ease. 🤖🌐 Click here to explore more: visionagents.ai/integrations… #TencentCloud #TencentRTC #AI #Stream #VisionAgents #CloudTech #GlobalBusiness #TechPartnership
Vision Agents now supports @TencentRTC 🎉 Low-latency multimodal AI across China and Asia-Pacific; same models, same workflows, new reach. 3,200+ nodes. Zero rebuild required.
Stream's agent skills can create an entire messaging app in one prompt! The skills explain how to build a Stream app to your coding agent. Prompt it, make some tea, come back to your working app. 🫖 Link in 🧵
Don’t start the weekend crashing out alone. We teamed up with @inworld_ai and @Anam__ai to build a hyper-interactive agent that watches your face and observes your tone. When you go quiet, it notices. When you look like you're about to lose it, it softens. Try it (you're welcome): visionagents.ai/
[say playfully] New TTS model release 🚨 Inworld’s TTS-2 model brings natural language steering, pause controls, rich emotions, prompting, and much more. Upgrade to the latest version of Vision Agents to get started, and check out the @inworld_ai release for more details 🧵👇
Introducing Realtime TTS-2, a new generation of voice model built for realtime conversation. It is the first voice model that hears the conversation, takes natural-language voice direction, holds one voice identity across over 100 languages, and speaks like a person who is paying attention. The result is voice AI that feels as good as it sounds. Try it out: tinyurl.com/RealtimeAI Learn More: tinyurl.com/TTS-2Blog