Co-founder and CEO @Hyperbolic_Labs. ex-@avax & ex-@citsecurities. Finished Math PhD in 2yrs @UCBerkeley. Math Olympiad Gold Medalist. Highest honor @PKU1898

California, USA
Pinned Tweet
AI is great at hitting explicit goals, but often at the cost of the hidden ones. Terence Tao just wrote about this. He points out: AI is the ultimate executor of Goodhart’s law, i.e. when a measure becomes the target, it stops measuring what we care about. Take a call center. Management sets a KPI: “shorten average call time.” Sounds reasonable: shorter calls should mean faster resolutions, happier customers. At first, it works. Agents become more efficient. But soon, people start gaming it: nudging customers to hang up when the problem is tricky, or just dropping the call themselves. The numbers look amazing. Call times plummet. But customer satisfaction? Straight into the ground. Now replace “call time” with “prove theorem X.” If human mathematicians did it, they’d refine definitions, polish lemmas, contribute back to Mathlib, train juniors, deepen the understanding of math structures, and strengthen the community. The AI, by contrast, optimizes only for the explicit goal. It might generate a 10,000-line proof in hours. Perfectly correct, but unreadable, unusable, and useless for human learning. The summit is reached but the forest along the way is gone. We need to start making our implicit goals explicit and design systems that protect the values we actually care about, not just the numbers we can measure.
79
115
945
82,002
Only in SF: walking home at 8pm and bumping into @perplexity_ai’s @AravSrinivas like it’s no big deal.
102
86
5,060
283,517
OpenAI people warn that you will send personal data to @deepseek_ai, but you can actually rent a GPU from @hyperbolic_labs and host your own R1 model using @ollama to avoid sending data to any company. @deepseek_ai R1 is the true user-owned AGI. The beauty of open source!
americans sure love giving their data away to the CCP in exchange for free stuff
Community note
DeepSeek can be run locally without an internet connection, unlike OpenAI's models. github.com/deepseek-a
139
538
4,599
880,694
What a plot twist for the windsurf drama 🤯 @cognition_labs laid off 30 employees from @windsurf last Friday and now offers buyouts to all 200 windsurf employees. For those who decided to stay, they are required to spend 6 days at the office and clock 80+ hour weeks.
109
122
2,544
634,150
Just got off work and tried Grok-4 on an undergrad topology problem. It took 9 minutes to think and then confidently gave a clean, plausible, but totally wrong answer 😅 Don’t think this one qualifies as “skillfully adversarial.” AI models are crushing benchmarks — but there is still a long way to go for real math AGI.
Grok 4 is at the point where it essentially never gets math/physics exam questions wrong, unless they are skillfully adversarial. It can identify errors or ambiguities in questions, then fix the error in the question or answer each variant of an ambiguous question.
111
114
2,269
666,712
DeepMind got a gold medal at the IMO on Friday afternoon. But they had to wait for marketing to approve the tweet — until Monday. @OpenAI shared theirs first at 1am on Saturday and stole the spotlight. In this game, speed > bureaucracy. Miss the moment, lose the narrative.
Just 20 minutes ago, the results of the 2025 IMO came out. China ranked No.1 and @GoogleDeepMind won a gold medal 🥇 Future math competitions will be China team vs USA Chinese team vs AI
71
89
1,372
488,948
Just talked to the @deepseek_ai guys and here are some deep secrets: V3 is just a start; they plan to release a new version in the next 3-6 months that is comparable to or even better than the latest GPT 4o model. They are very research focused and have never spent a dollar on marketing. The launch was not planned: it's just that a few days ago the model reached a certain level, so they decided to release it. They believe in decentralization and democratization of AI models and will keep open sourcing new AI models. DeepSeek never received any VC funding. They came from a top hedge fund called High-Flyer (幻方). A fun fact: Three years ago when I worked at Citadel, their cofounder wanted me to work with them (didn't do it because I wanted to build my own startup). He told me that they built a data center for running ML experiments for predicting the markets and executing strategies, but outside of trading hours, most of the GPUs sit idle. Looks like they have now found a good use for those idle GPU hours 😂
🚀 Introducing DeepSeek-V3! Biggest leap forward yet: ⚡ 60 tokens/second (3x faster than V2!) 💪 Enhanced capabilities 🛠 API compatibility intact 🌍 Fully open-source models & papers 🐋 1/n
42
105
1,311
189,496
We might be heading into a plot twist in the OpenAI vs. DeepMind IMO saga. Just saw a post from Joseph Myers (involved in the Math Olympiad since 1992): the IMO committee reportedly asked AI labs not to publish results until 7 days after the closing ceremony — out of respect for human contestants (see my post yesterday) and likely to allow time for proper verification of AI submissions and formats. According to Joseph, OpenAI didn’t collaborate with the IMO to test their model, and none of the 91 official IMO coordinators were involved in grading its solutions. Meanwhile, it seems DeepMind is following the rules and patiently waiting their turn. For context: The IMO has 6 problems, each worth 7 points. This year’s gold cutoff is 35 points. Even a small deduction could knock OpenAI down to silver. And from my read of their writeups, some parts might raise questions — and possibly cost points. Terence Tao also pointed out that while the problems stay the same, testing formats matter. A student who wouldn’t get a bronze under standard conditions might strike gold with a modified setup — which raises real questions about what “solving the IMO” means for AI. Next week might get spicy. Stay tuned.
DeepMind got a gold medal at the IMO on Friday afternoon. But they had to wait for marketing to approve the tweet — until Monday. @OpenAI shared theirs first at 1am on Saturday and stole the spotlight. In this game, speed > bureaucracy. Miss the moment, lose the narrative.
27
83
1,052
234,247
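The scoring arithmetic in the IMO post above (6 problems, 7 points each, gold cutoff 35) can be checked in a few lines. This is only a sketch: the claimed score of 35, i.e. five problems solved in full, is an assumption that the post implies but does not state, and `medal_margin` is a hypothetical helper name.

```python
# Figures quoted in the post: 6 problems, 7 points each, gold cutoff 35.
PROBLEMS = 6
POINTS_PER_PROBLEM = 7
GOLD_CUTOFF = 35


def max_score() -> int:
    """Highest possible total score."""
    return PROBLEMS * POINTS_PER_PROBLEM


def medal_margin(claimed_score: int, deduction: int) -> int:
    """Points above (positive) or below (negative) the gold cutoff after a deduction."""
    return claimed_score - deduction - GOLD_CUTOFF


# Assumption: five problems solved in full, which lands exactly on the cutoff.
claimed = 5 * POINTS_PER_PROBLEM

print(max_score())               # 42
print(medal_margin(claimed, 0))  # 0
print(medal_margin(claimed, 1))  # -1
```

With zero margin, a single lost point puts the score below the gold cutoff, which is why the grading details matter so much here.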
I guess @sama just learned: never let your best talents go live on a stream lol. A year ago, when I was chatting with Zhiqing, he was at CMU and GPU poor. Now he’s GPU rich. All jokes aside—huge congrats on joining @Meta!
After a great time at OpenAI, we (@EdwardSun0909, @_jasonwei) recently joined @Meta Superintelligence Labs. The first month has already been so much fun building from a clean slate with a truly talent-dense team! Very excited about the compute and long term focus of the new lab
25
31
800
183,437
Just read @OpenAI's solution to IMO Problem 1. The math checks out—it nailed the key lemma: for n > 3, any n-line cover of P_n must include a triangle side (i.e. a non-sunny line). That reduces the problem to n = 3, where everything becomes casework. Clean move.
However the writeup is kinda messy.
1. Overuses shorthand and sentence fragments
2. Introduces new terms without definitions—e.g. “forbidden”, “sunny partners”, as in “List all unordered pairs of points in S and check forbidden condition: Forbidden if x equal or y equal or sum equal.”
3. Lacks structural clarity (e.g. “Good lemma for S3” just dropped mid-proof)
4. Duplicates terminology: “forbidden” and “non-sunny” are used interchangeably without explanation
5. Way too verbose. A human writer would spot the key lemma, handle the n=3 base, build examples for 0/1/3, and finish in ~10% of the length.
OpenAI used an LLM for this. DeepMind’s rumored to be using Lean, which might bring more rigor and structure. Curious to see how their writeup compares once it’s out.
1/N I’m excited to share that our latest @OpenAI experimental reasoning LLM has achieved a longstanding grand challenge in AI: gold medal-level performance on the world’s most prestigious math competition—the International Math Olympiad (IMO).
31
38
786
231,751
Just 20 minutes ago, the results of the 2025 IMO came out. China ranked No.1 and @GoogleDeepMind won a gold medal 🥇 Future math competitions will be China team vs USA Chinese team vs AI
17
33
519
343,157
We built the first AI agent that has its own computer powered by @hyperbolic_labs! AI agents are now GPU rich!
We developed an AgentKit that allows AI agents to
• Check GPU availability
• Rent & manage GPU compute
• Access & run commands on remote machines
Why does this matter? With their own compute resources, AI agents can:
1. Validate blockchains like @Ethereum and decentralized protocols like @eigencloud
2. Launch and coordinate AI swarms on @hyperbolic_labs's decentralized compute network
3. Train and fine-tune models, improving their own capabilities over time
4. Dive into AI research to push the boundaries of AI, i.e. themselves
5. Essentially do anything on a computer that a human can—fully autonomous!
Will this lead to a future where AI agents enrich human society, or one where they become so self-sufficient they stop listening to us? Only time will tell.
——————
Big shoutout to @CoinbaseDev's CDP agentkit for inspiration. This repo was built by two non-engineers (our PM @KaiHuang and myself) plus a @cursor_ai AI agent, and it runs @langchain agents. Coding can now be done simply by prompting AI agents. What a crazy time!
57
79
483
161,750
My friends are always surprised by how I can spot someone’s face in a giant crowd. Here’s my secret: years of training in geometry and topology. To me, a face is a 2D surface embedded in 3D space, each with its own curvature, genus, and geometric/topological invariants. Whenever I see someone new (online or in person), my brain automatically extracts geometric features from their face and stores them in a mental database. So next time when I see you, I will extract the features of your face and run a 1-nearest neighbor search against my internal database. Not 100% accurate (no ML/AI model is), but my hit rate is solid. And when I get it wrong, I will do reinforcement learning to update the feature set. You could call it: “Topology-aware feature encoding with lifelong learning.” Math really does help. If you see me in SF, feel free to say hi — I’ll compute your curvature and probably remember you next time 😉
Only in SF: walking home at 8pm and bumping into @perplexity_ai’s @AravSrinivas like it’s no big deal.
41
4
289
76,840
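The 1-nearest-neighbor search mentioned in the face-recognition post above can be sketched in a few lines. This is an illustration only: the feature vectors, the names `alice` and `bob`, and the helper `nearest_neighbor` are all hypothetical, and Euclidean distance is an assumed choice of metric.

```python
import math


def nearest_neighbor(query, database):
    """Return the label whose stored feature vector is closest to `query`.

    `database` maps a label to a feature vector of the same length as `query`.
    Distance is plain Euclidean distance (an assumption for this sketch).
    """
    if not database:
        raise ValueError("database is empty")
    return min(database, key=lambda label: math.dist(query, database[label]))


# Hypothetical stored "face features".
faces = {
    "alice": (0.1, 0.9, 0.3),
    "bob": (0.8, 0.2, 0.5),
}

print(nearest_neighbor((0.2, 0.8, 0.3), faces))  # alice
```

A linear scan like this is fine for a small database; larger ones usually call for an index structure instead.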
2025 will be the on-chain year. We will do a fun experiment:
1. launch an AI agent
2. give it a crypto wallet with $1000
3. let it have access to all the DEXs
4. let the agent trade autonomously
5. watch it become the first millionaire AI trader
Every action is verifiable through PoSP on @hyperbolic_labs x @eigencloud, so no chance to rug.
The hardest part: what should we call it? 🤔
Yesterday all CEXs saw a massive outflow of SOL usdc rushing onchain for $TRUMP. Moonshot + Meteora/Jupiter created a 30B+ memecoin in less than 10hr without needing any CEX. Reflection: the Web3 and DEX option is already here. In 2025 we are focused on building the Onchain Bybit. We are going to invest heavily in the Bybit web3 wallet (self custody) user experience and strengthen our onchain infrastructure. We want to be your gateway to web3. Simply because if we don't join this revolution, we will be obsolete. Btw, Bybit Dex Pro is already the one stop dex aggregator that many users love, check it out bybit.com/en/web3/dex/sol/?u…
25
28
227
76,211
Quick tutorial on how to run Llama 4 within 10 minutes
1. rent a 4x H100 instance on app.hyperbolic.xyz/compute (Llama 4 Scout has 109B parameters in bf16, so the weights are already 218GB)
2. open a terminal tool and SSH into the machine
3. run the following commands:
>> sudo apt-get update && sudo apt-get install -y python3-pip
>> pip install -U vllm
>> pip install -U "huggingface_hub[cli]"
4. get an access token on the @huggingface website and run
>> huggingface-cli login
5. use @vllm_project to serve Llama 4
>> vllm serve meta-llama/Llama-4-Scout-17B-16E-Instruct --tensor-parallel-size 4 --max-model-len 10000
6. open a new terminal and call the API to ask "What can I do in SF?":
>> curl http://localhost:8000/v1/chat/completions \
   -H "Content-Type: application/json" \
   -d '{ "model": "meta-llama/Llama-4-Scout-17B-16E-Instruct", "messages": [ {"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What can I do in SF?"} ] }'
It's just that simple ;) A big thank you to @AIatMeta and @vllm_project for making it easy to access the best open intelligence!
Today is the start of a new era of natively multimodal AI innovation. Today, we’re introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick — our most advanced models yet and the best in their class for multimodality. Llama 4 Scout • 17B-active-parameter model with 16 experts. • Industry-leading context window of 10M tokens. • Outperforms Gemma 3, Gemini 2.0 Flash-Lite and Mistral 3.1 across a broad range of widely accepted benchmarks. Llama 4 Maverick • 17B-active-parameter model with 128 experts. • Best-in-class image grounding with the ability to align user prompts with relevant visual concepts and anchor model responses to regions in the image. • Outperforms GPT-4o and Gemini 2.0 Flash across a broad range of widely accepted benchmarks. • Achieves comparable results to DeepSeek v3 on reasoning and coding — at half the active parameters. • Unparalleled performance-to-cost ratio with a chat version scoring ELO of 1417 on LMArena. These models are our best yet thanks to distillation from Llama 4 Behemoth, our most powerful model yet. Llama 4 Behemoth is still in training and is currently seeing results that outperform GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on STEM-focused benchmarks. We’re excited to share more details about it even while it’s still in flight. Read more about the first Llama 4 models, including training and benchmarks ➡️ go.fb.me/gmjohs Download Llama 4 ➡️ go.fb.me/bwwhe9
32
36
226
55,068
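The sizing note in step 1 of the Llama 4 tutorial above (109B parameters in bf16, so 218GB of weights) is simple arithmetic, sketched below. The 80 GB per-GPU figure is an assumption about the H100 variant being rented and is not stated in the post. The estimate covers weights only, not KV cache or activations.

```python
# Figures from the post: 109B parameters, bf16 weights, 4 GPUs.
PARAMS = 109e9
BYTES_PER_PARAM_BF16 = 2  # bf16 stores each parameter in 16 bits
GPU_COUNT = 4
GPU_MEMORY_GB = 80  # assumption: 80 GB H100s; not stated in the post


def weights_gb(params: float, bytes_per_param: int) -> float:
    """Memory needed for the raw weights, in decimal gigabytes."""
    return params * bytes_per_param / 1e9


needed = weights_gb(PARAMS, BYTES_PER_PARAM_BF16)
available = GPU_COUNT * GPU_MEMORY_GB

print(f"weights: {needed:.0f} GB")                  # weights: 218 GB
print(f"available: {available} GB")                 # available: 320 GB
print(f"headroom: {available - needed:.0f} GB")     # headroom: 102 GB
```

Under that assumption the weights would not fit on two GPUs (160 GB), which is consistent with the post's choice of four and of `--tensor-parallel-size 4`.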
Give me bf16 or give me death
I ran @lmsysorg's Arena-Hard-Auto benchmark (github.com/lm-sys/arena-hard…) to compare llama 3.1 405B in bf16 (hosted on @hyperbolic_labs), turbo (hosted on @togethercompute) and fp8 (hosted on @lmsysorg's chatbot arena). The difference is quite significant.
1. bf16 | score: 69.2 | 95% CI: (-2.3, 2.5)
2. turbo | score: 62.5 | 95% CI: (-2.1, 2.0)
3. fp8 | score: 64.1 | 95% CI: (-2.2, 2.9)
(The third result comes from github.com/lm-sys/arena-hard…)
Unsurprisingly, the unquantized bf16 model is much better than the quantized models. Although Together claims that turbo endpoints closely match the quality of full-precision models (together.ai/blog/meta-llama-…), its score is even lower than the standard fp8 quantized model for Llama3 405B.
Here's how the benchmark works: it contains 500 challenging user queries coming from lmsys' chatbot arena. It prompts GPT-4-Turbo as a judge to compare the models' responses against a baseline model (default: GPT-4-0314). Notably, Arena-Hard-Auto has the highest correlation and separability to Chatbot Arena among popular open-ended LLM benchmarks.
This result indicates the importance of running the full-precision models. However, Arena-Hard-Auto is just an approximation of the real human feedback evaluation. It would be more accurate if we could start testing these different quantized/unquantized models in the chatbot arena. We need your help 👀 @lmsysorg
P.S. Arena-Hard-Auto uses GPT-4-Turbo by default. A potential next step is to use GPT-4o, Claude 3.5 Sonnet and other SOTA models to test for better accuracy.
I recall earlier that @lmsysorg ran with fp8 not bf16 but there was someone in the comments saying it makes only a minor difference, sounds like this disagrees?
14
28
218
65,111
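The three Arena-Hard-Auto scores in the bf16 post above come with 95% confidence-interval offsets, and a quick way to read them is to turn each into an absolute interval and check for overlap. This is a rough sketch: interval overlap is only a heuristic, not a formal significance test, and the helper names are hypothetical.

```python
# (score, lower offset, upper offset) as reported in the post.
RESULTS = {
    "bf16": (69.2, -2.3, 2.5),
    "turbo": (62.5, -2.1, 2.0),
    "fp8": (64.1, -2.2, 2.9),
}


def interval(name):
    """Absolute (low, high) bounds of the 95% CI for one model."""
    score, lower_offset, upper_offset = RESULTS[name]
    return (score + lower_offset, score + upper_offset)


def overlaps(a, b):
    """True if the two models' confidence intervals share any point."""
    a_low, a_high = interval(a)
    b_low, b_high = interval(b)
    return a_low <= b_high and b_low <= a_high


for pair in (("bf16", "turbo"), ("bf16", "fp8"), ("turbo", "fp8")):
    print(pair, overlaps(*pair))
```

On the reported numbers, bf16 and turbo do not overlap, turbo and fp8 do, and bf16 and fp8 overlap by only about 0.1 points. That fits the post's reading that bf16 clearly beats turbo, while the turbo versus fp8 gap is less clear-cut.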
We now have a 32B model that can run on consumer-grade devices and rivals cutting-edge reasoning models like @deepseek_ai R1 and @OpenAI o1. We're entering the stage of democratization of AI models. @hyperbolic_labs served it on the first day, and you can try it now on our dashboard and @huggingface!
Today, we release QwQ-32B, our new reasoning model with only 32 billion parameters that rivals cutting-edge reasoning models, e.g., DeepSeek-R1. Blog: qwenlm.github.io/blog/qwq-32… HF: huggingface.co/Qwen/QwQ-32B ModelScope: modelscope.cn/models/Qwen/Qw… Demo: huggingface.co/spaces/Qwen/Q… Qwen Chat: chat.qwen.ai This time, we investigate recipes for scaling RL and have achieved some impressive results based on our Qwen2.5-32B. We find that RL training can continuously improve the performance especially in math and coding, and we observe that the continuous scaling of RL can help a medium-size model achieve competitive performance against gigantic MoE model. Feel free to chat with our new models and provide us feedback!
13
29
213
52,756
Why we priced H100 GPUs at $0.99/hr — the cheapest on the market today
Most clouds hide pricing behind sales calls, long-term contracts, or surge rates. We didn’t. At @hyperbolic_labs we priced H100s at $0.99/hr because that’s what we wished we could’ve paid when building. Here’s the logic:
1. GPU pricing is broken. AWS and GCP charge $4–$6/hr for H100s—if you can even get them. That’s not sustainable for solo devs, startups, or students. You’re burning cash just to prototype.
2. We aggregate idle capacity. We don’t own data centers. We tap into underutilized GPUs from partners around the world—giving you cheaper, on-demand access without sacrificing performance.
3. Pricing should be transparent and fair. No surprise fees. No lock-in. What you see is what you pay: $0.99/hr. Rent for an hour, a day, or spin up 20 instances at once. It just works.
4. We’re betting on builders. The next breakthrough won’t come from a Big Tech lab—it’ll come from someone shipping in a dorm room, garage, or coworking space. We want to support you.
And yeah—it turned out to be the best business decision we’ve made. The demand has been wild. Turns out when you respect developers, they show up. Get started at app.hyperbolic.xyz/compute?u…
51
18
196
27,024
Had a great chat with @const_reborn, cofounder of @opentensor. Here’s what I learned about building a successful protocol: “You don't market your token, don't market it because the best marketers are your community. You give the tokens away to people in your community that are mining your token and are really smart people that are able to play the game that was very difficult. And then they go out and they promote your product for free and it's organic. It's like guerrilla marketing. It's so lame when you have to market you're like, oh, my token's really good. It's fxxking lame. It's not cool. What's way cool is if somebody else says you're cool, right?”
10
11
192
11,050
Only in SF: met the GOAT @JeffDean last week and asked him about democratizing AI. One week later Gemini CLI drops for free. Thanks for giving everyone a free coding buddy 💜
Introducing Gemini CLI, a light and powerful open-source AI agent that brings Gemini directly into your terminal. >_ Write code, debug, and automate tasks with Gemini 2.5 Pro with industry-leading high usage limits at no cost.
18
4
163
18,496
Replying to @ns123abc
The U.S. banning chip exports to China will only make China grow its chip manufacturing industry faster
5
1
155
5,443
GAUSS: General Assessment of Underlying Structured Skills in Mathematics We’re excited to launch GAUSS, a next-generation math AI benchmark built to overcome the limitations of low skill resolution in today’s benchmarks. What it does GAUSS profiles LLMs across 12 cognitive skill dimensions, spanning knowledge, reasoning, learning, and creativity, offering a precise and comprehensive view of models’ mathematical ability. Why it matters By exposing strengths and weaknesses at a fine-grained level, GAUSS lays the foundation for advancing math AI from surface-level pattern recognition toward genuine reasoning and understanding. What we found Applying GAUSS to GPT-5 Thinking, we learned: ✅ Strong in taxonomy recall, evaluating arguments, plausibility checks, summarizing advanced papers, and posing problems ❌ Weak in theorem application, symbolic computation, problem-solving strategies application, geometric intuition and generalization. What’s next We’re building curated problem sets with rubrics via community crowdsourcing, skill charts for LLMs, and an AI auto-grader, foundations for model training toward math superintelligence. We warmly invite everyone to join the GAUSS community, contribute problems through our portal and help shape the future of Math AI! This work was led by myself and Jiaxin Zhang (@JiaxinZhang626) at @hyperbolic_labs / @Caltech, together with Qiuyu Ren & Tahsin Saffat at @UCBerkeley, Lily Liu (@eqhylxx) at @UCBerkeley → now @OpenAI, Zitong Yang (@ZitongYang0) at @Stanford, Prof. Banghua Zhu (@BanghuaZ) at @nvidia / @UW, and Prof. Yi Ma (@YiMaTweets) at @UCBerkeley / @HKUniversity. Links and details below 👇 (1/n)
7
29
158
89,634
Replying to @zjasper @OpenAI
Clarification: I’ve been told by someone at Google that their IMO results are still being verified internally. Once that’s done, they plan to share them officially—curious to see their approach. Another source mentioned that the IMO committee asked not to publicly discuss AI involvement within a week after the closing ceremony. Things just got a bit more interesting 🧐
4
4
153
28,457
We supported the new @deepseek_ai v3 model and became the first one on @huggingface! Just gave it a vibe check with an AIME 2025 problem and it solved it smoothly. More confident than ever that open-source AI models will win in the end! Try it out at app.hyperbolic.xyz/models/de…
18
16
156
30,196
Our @hyperbolic_labs agent can now perform fine-tuning tasks! This is a step forward for our self-evolving agent vision. Kudos to @zile_cao from @bcap! This is how it works: 1. sync files to remote machine 2. install relevant dependencies 3. run an initial @UnslothAI fine-tune task 4. do an inference call to test This is the beauty of open source. Everyone can contribute towards a shared vision. Accelerate 🚀
15
46
130
20,985
If you’re building in AI, you know the infra pain curve to compute is REAL. As a developer, infrastructure shouldn't be your enemy. But for most devs navigating GPU infra today, it's less like an ally and more like an expensive headache. A lot of devs I know have walked this path: Initial Phase: Big Cloud’s Free Credits Trap Get lured in by free credits from big clouds > then get shocked by your startup’s workload fees. Ever stared at a bill charging ~$5/hr for an H100 GPU? It feels like a betrayal. AWS, Google, Azure – great for scalability, terrible for your budget. Next Phase: Frustration with GPU Marketplaces Now you're ready to find cheaper options, so you look for GPU marketplace rentals from companies you’ve barely heard of. Hoping to avoid big cloud pricing, you turn to marketplaces, only to find: - Unpredictable provisioning delays ("6 hours later, still spinning up…") - Unreliable dashboards and sketchy providers - Hidden fees and unclear pricing It's an emotional rollercoaster: stress, frustration, wasted productivity, and looming deadlines. The Psychology of Developer Frustration Let's get psychological for a moment. Humans (and especially developers) value: - Transparency: We despise hidden costs and vague timelines. - Control: Waiting indefinitely for GPU provisioning feels powerless. - Efficiency: We want to code, not manage infrastructure. Most GPU solutions today violate all these core psychological needs. At Hyperbolic, our core principle is simple: infrastructure shouldn't punish developers. Here’s how we built GPU infra differently: - Clear, Transparent Pricing You can trust the prices you see. Lock in bulk GPU rentals or single instances, including $0.99/hr for NVIDIA H100s. No surprises, no hidden fees. - Instant Provisioning Spin up your instances in under 1 minute. Instant control, instant gratification. We respect your time and deadlines. - Startup-Friendly Bulk GPU Rentals We scale with you. Bulk rentals come with even lower pricing. 
You get affordable scalability, even on a tight budget. Say goodbye to overspending. The Hyperbolic Experience At Hyperbolic, we’re constantly running user interviews. When we think about the best-case developer experience, we picture our users going hyperbolic. Picture this: You’re ready to train your latest AI model. Within 60 seconds, your GPU cluster is provisioned. You’re paying a fraction of big cloud pricing. Your infrastructure disappears into the background—exactly as it should. Developers using Hyperbolic report a tangible psychological shift—from stressed and reactive to relaxed and proactive. As a developer, your infra choices aren't just operational, they're strategic. Choosing GPU infra that reduces friction means you're not wasting time getting infra to work (how crazy is that?!). You’re pumping out features and moving faster to market. This isn’t just about saving money (though you will); it’s about enabling you to do what you do best—build. Skip the pain. Rent compute at app.hyperbolic.xyz/compute?u…
45
25
128
17,881
Replying to @sankitdev
Just met him this morning. Bro has been locked in growing his business.
5
142
38,451
Replying to @jxmnop
He’s only a contributor. @zhuohan123 @simon_mo_ @woosuk_k @eqhylxx are the main contributors that invented vLLM
2
6
157
25,286
Startup life after 4 years is so different from startup life in the first year
2
135
26,055
Replying to @zjasper @OpenAI
Problem 2 is a geometry problem—normally solved by humans using intuition, symmetry, and elegant theorems. OpenAI, however, went full brute-force: 442 lines of coordinate geometry and algebra. Technically flawless. Aesthetically not so much. No geometric insight, no clever ideas—just a massive algebraic grind. It solves the problem, but doesn’t feel mathematically aesthetic.
6
3
132
25,556
It’s been a wild year for @hyperbolic_labs.
> built two strong products: GPU marketplace and decentralized AI inference
> designed a low-overhead verification mechanism for decentralized AI - Proof of Sampling
> raised $20M total funding in two rounds led by @polychain, @variantfund and @FactionVC
> decentralized AI inference has been live for more than 8 months, processes 2B+ tokens every day, and is integrated with @huggingface, @Quora, @virtuals_io and many other platforms
> GPU marketplace provides the easiest 1-minute onboarding flow for suppliers via Hyper-dOS (hyperbolic decentralized operating system), and all the GPUs have been rented out over the past few days (100% utilization rate!)
> built our own AI agent framework that lets AI agents orchestrate compute on our platform, among many other capabilities
> grew from a team of 4 to 15+ and everyone strongly believes in the vision
Can’t wait to see what we are gonna build next year. Accelerate 🚀
22
16
131
14,405
Great to host the legend @sreeramkannan at our office! Had a 2-hour brainstorm about 1. why wild AGIs (on chain) are more beneficial to humans than domestic AGIs (OpenAI) 2. how verifiable AI inference can help build trust between humans and AI lives 3. how AI lives will create more GDP than humans 4. how to prevent rogue AGIs from controlling the world Cooking something special @hyperbolic_labs x @eigencloud
11
8
133
9,560
We appeared on the famous @lexfridman podcast! @hyperbolic_labs offers the cheapest inference for @deepseek_ai models among major AI inference providers like Together AI and Fireworks AI. Thanks for including us in the chart! @dylan522p @natolambert
Here's my 5-hour conversation with @dylan522p and @natolambert on DeepSeek, China, OpenAI, NVIDIA, xAI, Google, Anthropic, Meta, Microsoft, TSMC, Stargate, megacluster buildouts, RL, reasoning, and a lot of other topics at the cutting edge of AI. This was a mind-blowing, super-technical, and fun conversation. Yes, we discuss r1 and o3-mini, but more importantly we look into the future of technology, geopolitics, and humanity in a world that stands on the precipice of a global AI revolution. The first 4 hours are here on X (4 hours is current limit), and the full 5 hours are up everywhere else. Links in comment. Timestamps: 0:00 - Introduction 3:33 - DeepSeek-R1 and DeepSeek-V3 25:07 - Low cost of training 51:25 - DeepSeek compute cluster 58:57 - Export controls on GPUs to China 1:09:16 - AGI timeline 1:18:41 - China's manufacturing capacity 1:26:36 - Cold war with China 1:31:05 - TSMC and Taiwan 1:54:44 - Best GPUs for AI 2:09:36 - Why DeepSeek is so cheap 2:22:55 - Espionage 2:31:57 - Censorship 2:44:52 - Andrej Karpathy and magic of RL 2:55:23 - OpenAI o3-mini vs DeepSeek r1 3:14:31 - NVIDIA 3:18:58 - GPU smuggling 3:25:36 - DeepSeek training on OpenAI data 3:36:04 - AI megaclusters 4:11:26 - Who wins the race to AGI? 4:21:39 - AI agents 4:30:21 - Programming and AI 4:37:49 - Open source 4:47:01 - Stargate 4:54:30 - Future of AI
14
5
135
24,495
A small Christmas gift 🎅 from @hyperbolic_labs: you can now play with @deepseek_ai v3 through our APIs! It's running @lmsysorg's sglang on H200. Kudos to @leshenj15 and @Yuchenj_UW! We will keep optimizing the speed 🚄
Try it yourself:
curl -X POST "api.hyperbolic.xyz/v1/chat/c…" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $HYPERBOLIC_API_KEY" \
  --data-raw '{
    "messages": [ { "role": "user", "content": "How many gifts will Santa Claus deliver on Christmas?" } ],
    "model": "deepseek-ai/DeepSeek-V3",
    "max_tokens": 512,
    "temperature": 0.7,
    "top_p": 0.9,
    "stream": false
  }'
15
12
128
21,904
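The curl call in the DeepSeek v3 post above can be mirrored in Python with only the standard library. This is a hedged sketch: the endpoint URL is truncated in the post, so it is left as a parameter for the reader to supply, and `build_chat_request` is a hypothetical helper name. The payload fields and headers are copied from the curl command.

```python
import json
import urllib.request


def build_chat_request(endpoint: str, api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request matching the curl example."""
    payload = {
        "messages": [{"role": "user", "content": prompt}],
        "model": "deepseek-ai/DeepSeek-V3",
        "max_tokens": 512,
        "temperature": 0.7,
        "top_p": 0.9,
        "stream": False,
    }
    return urllib.request.Request(
        endpoint,  # full URL including scheme; not filled in here
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# To send it, pass the request to urllib.request.urlopen, for example:
#   with urllib.request.urlopen(build_chat_request(url, key, "Hello")) as resp:
#       print(json.load(resp))
```

Keeping request construction separate from sending makes the payload easy to inspect before any network call is made.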
Only in SF: Had breakfast with @matiii, cofounder & CEO of ElevenLabs and learned how they scaled the team to 300+. He shared his weekly breakdown: • 25% hiring • 25–50% sales (and generate product insights) • 25% misc And every Saturday, he personally tests new product features and sends feedback to the team. Now I get why ElevenLabs builds the best voice AI product. 💡
6
7
125
20,279
Met Prof. Yi Ma in 2018 when I took an advanced AI/ML class “High-Dimensional Data Analysis with Low-Dimensional Models” and did research at BAIR (Berkeley AI Research) in the first year of my PhD at Berkeley. I was fortunate to hear a lot of advice from him and it has helped with my decision to make AI more accessible (at that time we already faced the shortage of RTX 3090). I still remember one of his pieces of advice: it’s hard to predict exactly when an opportunity will come, but we must do everything we can to be ready for it—because it will come, sooner or later.
Visited a new AI startup Hyperbolic in San Francisco by a math prodigy from Berkeley, Zhang Yue…
12
4
118
16,546
By 2030, we need 4X more data centers than exist today. But here's the $1 trillion secret: We don't need to build them at all. Here’s how @hyperbolic_labs is disrupting the entire AI compute industry—
23
7
109
10,252
Only in SF (and on X): > Got followed by @DoorDash cofounder @andyfang after I posted about meeting @AravSrinivas from @perplexity_ai > Since we’re also building a marketplace, decided to cold DM Andy for advice > Surprisingly, he replied > Got his number, had a great convo, and walked away with 10x clarity We’re more confident than ever about how to build the world’s biggest GPU marketplace. Here are 5 key lessons from Andy on building a successful marketplace: 1. Early Traction > Perfect Tech > Don’t over-index on building complex systems too early. > DoorDash’s early system was mostly manual—no fancy algorithms, just solving a real logistics pain point. They even started with delivering the food themselves. > Focus on proving value to both sides of the marketplace with minimal tech. > Later, they built sophisticated algorithms, but the core business always mattered more. 2. Value Proposition for Each Side > In every marketplace, ask: why should each side use our product? >> Customers: For DoorDash, it was reliable food delivery. For us, it’s likely on-demand compute. >> Suppliers: For DoorDash, it was incremental revenue. For us, easier monetization of idle GPUs. > Don’t obsess over scale or margins too early. Just focus on solving real, immediate pain. 3. Unit Economics & Profit Path > Look at per-transaction unit economics—do they make sense? > Even if you’re not profitable yet, having a clear path to breakeven is critical. > If the unit economics are positive, then growing faster accelerates profitability rather than increases burn. 4. Growth Hinges on What Customers Want > What really matters is demand. But demand is shaped by supply. >> DoorDash had a tight link between number of restaurants and market growth. >> Airbnb grew by expanding the variety and availability of stays. In Andy’s words: “Supply is an input to demand, but demand drives the business.” 5. Gross Margin as a Strategic Lever > If you’re already gross profit positive, you have flexibility. 
> Lowering gross margin intentionally could unlock more attractive demand profiles or improve network density. > It’s a lever worth testing, especially if you can already afford the tradeoff. Massive thanks to Andy for being generous with his time and insight!
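A rough sketch of lesson 3, assuming a very simple model where monthly profit is order volume times per-order contribution minus fixed costs. Every figure and function name below is hypothetical and chosen only to show the direction of the effect; none of it comes from DoorDash or Hyperbolic.

```python
def monthly_profit(orders, contribution_per_order, fixed_costs):
    """Profit for one month: per-order contribution scaled by volume, minus fixed costs."""
    return orders * contribution_per_order - fixed_costs


def breakeven_orders(contribution_per_order, fixed_costs):
    """Smallest whole number of orders at which monthly profit is non-negative."""
    if contribution_per_order <= 0:
        return None  # negative unit economics: no volume ever breaks even
    return -(-fixed_costs // contribution_per_order)  # ceiling division on integers


# Hypothetical numbers: 2 units of contribution per order, 10,000 in fixed costs.
positive_small = monthly_profit(1_000, 2, 10_000)
positive_large = monthly_profit(10_000, 2, 10_000)

# Same fixed costs, but each order now loses 1 unit.
negative_small = monthly_profit(1_000, -1, 10_000)
negative_large = monthly_profit(10_000, -1, 10_000)

print(positive_small, positive_large)  # growth moves the business toward profit
print(negative_small, negative_large)  # growth deepens the loss
print(breakeven_orders(2, 10_000))
```

With positive per-order contribution, tenfold growth turns a loss into a profit; with negative contribution, the same growth only increases burn.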
Just got a follow from @DoorDash cofounder/CTO @andyfang 👀. Huge fan—definitely saying hi when we cross paths in SF! Fun fact: their YC application video was one of the very first I watched when learning how to pitch. Wild to see how far the journey’s come. piped.video/watch?v=Rzlr2tNS…
10
3
114
17,501
Hyperbolic intends to acquire Cohere immediately after their acquisition of Perplexity, which will come immediately after Perplexity's acquisitions of TikTok and Google Chrome. We will continue to monitor the progress of those deals closely so we can submit our term sheet upon completion.
Cohere intends to acquire Perplexity immediately after their acquisitions of TikTok and Google Chrome. We will continue to monitor the progress of those deals closely so we can submit our term sheet upon completion.
6
5
115
22,857
I’m thrilled to announce that Hyperbolic has raised $7M in seed funding! The past two years since our founding have been an incredible journey. As a team, we’ve weathered numerous ups and downs, and I’ve been fortunate to work alongside so many exceptional talents. To everyone who has been part of this journey - thank you! Our experiences and countless conversations with developers, researchers, and AI enthusiasts have led us to three fundamental realizations: 1. AI’s potential is limitless, but access to computational resources is not. 2. The future of AI development should be collaborative and open, not siloed and exclusive. 3. There’s an urgent need for a solution that makes AI services accessible, safe, and verifiable. These insights have shaped Hyperbolic’s mission and the open-access AI cloud we’re building. Ultimately, our goal is to build an open AI ecosystem and economy. We’re not just creating a product; we’re nurturing an entire ecosystem. Think of Hyperbolic as a rainforest - a place to explore, build, and innovate in ways you never thought possible. Each community member is a unique player in this ecosystem, able to contribute in different aspects: providing compute, building services, offering data, or improving AI models. Join us, and together, we’ll grow this rainforest and succeed! Many thanks to our investors and friends: @polychaincap @FactionVC @chapterone @LongHashVC @BanklessVenture @joinrepublic @blackdragon_vc @NomadCapital_io @CoinSummerLabs @thirdearthcap @avax @BlizzardFund @SamsungNext @Modular_Capital @ImoVentures @snzholding @AusvicCapital @seidtweets @jamesjho_ @balajis @ilblackdragon @caseykcaruso @tekinsalimi @santiagoroel @sandeepnailwal @0xkenzi @wsbmod @blknoiz06 @AaronBuchwald @sandypeng @shenhaichen @victorJi15 @superanonymousk @shumochu @0xevevm @linfluence @satpugnet @HominLuo
We are thrilled to announce a $7M raise to become the leading Open-Access AI Cloud 🤘🏼🌪️ At Hyperbolic, we’re building an open AI ecosystem and economy where everyone who contributes is rewarded. Our goal is not to merely optimize AI performance to compete with traditional Web2 services, but to create an inclusive AI economy accessible to all. The future of AI is collaborative. Join us. app.hyperbolic.xyz/?utm_sour…
28
13
98
37,554
Nothing is cooler than trying @deepseek_ai R1 on @huggingface powered by @hyperbolic_labs. Hugging Face is the @github for AI models—an open platform for sharing, building, and deploying AI models like DeepSeek and Llama. We're getting mainstream adoption step by step 🦶
12
17
108
16,235
What do you think about this billboard idea for Hyperbolic?
23
3
110
12,364
We’re thinking about having a billboard in SF. Let us know which one you like the most, the cat or our chief twitter officer @Yuchenj_UW!
46
3
103
8,938
Replying to @zjasper @OpenAI
Problem 3, on the other hand, is a functional equation problem—and surprisingly, the solution here is much better. It’s clean, concise, and reads almost like a human wrote it. In fact, it’s shorter than the solution for Problem 1, even though functional equations (especially IMO Problem 3 ones) usually tend to be on the longer side. The balance of detail and clarity is just right. Much better than their writeups for Problems 1 and 2.
4
1
96
10,670
Just vibe-coded my first Cursor extension using Cursor in the Cursor office with @mntruell and Ravi Rahman. Congrats to the young and ambitious @cursor_ai team!
Damn 🔥🔥 Cursor reportedly raised $900 million in a new round led by Thrive Capital at $9 billion valuation. ~ TechCrunch Still waiting on the official announcement.
14
1
97
6,919
Now I'm Ghiblified in videos! Kudos to our community member @0xdareh!
0xdareh
19
7
94
4,229
A sick panel with the top blockchains @luca_curran from @base, @knimkar from @solana, @allred_chase from @arbitrum, Josh from @Optimism, @JarrodBarnes from @NEARProtocol and hosted by @dabit3 from @eigencloud Soon to have fireside chats with @sreeramkannan @ilblackdragon. Come to our event before it’s too late! lu.ma/hyperhouse
Hype(r)House ETHDenver by Hyperbolic + Eigen Layer nitter.app/i/broadcasts/1MnxnwVjY…
6
8
88
16,811
Re: “At Secret Math Meeting, Researchers Struggle to Outsmart AI” — What Actually Happened Just saw a news report about the FrontierMath Symposium (hosted by @epochai). While AI is advancing at an incredible pace, I think some parts of the report were a bit exaggerated and could use clarification. (Opinions are my own.) About a month ago, I participated in the FrontierMath Symposium alongside 30 other mathematicians. Our task was to create math problems that would take a human mathematician about a week to solve and that AI models would struggle with. One special constraint though: each problem needed a numerical answer, even though advanced math typically centers on reasoning and proof rather than pure computation. I was in the geometry and topology group, and we aimed to create problems that required geometric intuition and understanding of key theorems. Initially, we believed current AI models were weak at advanced geometry and topology — so we designed several PhD-level problems requiring conceptual depth. To our surprise, @OpenAI's o4-mini-high (the best math model I’ve tested so far) was able to solve the majority of them. While the reasoning was occasionally incorrect, it still managed to arrive at the correct numerical answers. I’ve attached one example below. Other mathematicians found some other interesting facts — even for problems involving recent research results, AI was surprisingly effective at finding, referencing, and applying those results. So, I adjusted my strategy. I took a math paper, extracted some intermediate theorems, and created a problem that required synthesizing those results into a computational method. As expected, AI struggled — it couldn’t connect the intermediate steps or reason through the chain of logic effectively. 
My takeaways from the 2-day experience: > AI has improved dramatically over the past two years > But current LLMs still rely heavily on pattern matching with limited deep reasoning > They’re not yet capable of generating new mathematical results, but they excel at gathering relevant literature and drafting initial solutions > Human oversight remains essential — especially for verification and synthesis My prediction: In the next 1–2 years, we’ll see AI assist mathematicians in discovering new theories and solving open problems (as @terrence_tao recently did with @DeepMind). Soon after, AI will begin to collaborate — and eventually work independently — to push the frontiers of mathematics, and by extension, every other scientific field. Report: scientificamerican.com/artic… P.S. It was fun (and a little surreal) to be called one of the “thirty of the world’s most renowned mathematicians” — though in reality, many smarter and more talented mathematicians couldn’t attend. P.S.2 Big thanks to @OpenAI for providing free access to the pro plan and letting us try out o4-mini-high. Looking forward to experimenting with other frontier models by @GoogleDeepMind @AnthropicAI @xai 😉
11
17
89
17,087
Replying to @deedydas
Makes more sense for the price
19
11
84
97,812
We're now on @brave. Come build apps and agents on Hyperbolic. app.hyperbolic.xyz
11
11
83
20,280
1/ Evaluating LLMs used to be like comparing apples to oranges. Not anymore. Testing LLMs across standardized benchmarks is now possible. And the best news is that the tools are open source! Here are the essential benchmarks to evaluate any LLM
20
15
83
10,125
The coolest AI agent demo days series that you can’t miss! Accelerate 🚀 Thanks for inviting me as a guest 💜 @dabit3
AI Agent Demo Days Episode 2 13 agent demos split into bite-size videos. Each team breaks down their agent architecture. 1. Starting with @yq_acc and @0xautonome Autonome is a no-code platform for launching verifiable AI agents on @eigencloud that allows users to quickly deploy agents using frameworks like @ai16zdao Eliza with just a few clicks, while being secured by TEE and AVS technology. @yq_acc demonstrates the platform by showing how users can create agents, configure their Twitter and Telegram integrations, and manage prompts and character files through a simple dashboard interface.
3
12
87
10,440
interesting to see that Amazon launched a new ai foundational model and got less than 200 likes 🧐
Announcing Amazon Nova, a new generation of foundation models that have state-of-the-art intelligence across a wide range of tasks, & industry-leading price performance. Learn more about the new Amazon Nova models available in Amazon Bedrock: amzn.to/4gkYd3j #AWSreInvent
8
28
111,949
Replying to @mepps32
Haha everyone needs to be proud of and hype their own product
1
81
7,394
Understanding AI Benchmarks - GSM8K @mattshumer_’s Reflection Llama model was released two days ago, achieving higher metrics on several popular benchmarks compared to GPT-4o, Claude 3.5 Sonnet, and Llama 3.1 405B. Notably, the post claimed the model reached a score of 99.2% on GSM8K. Since then, there has been heated discussion regarding this score on GSM8K. Some of the comments include: > “99.2% performance on GSM8k even though GSM8k has more than 1% error rate.” — @gazorp5 nitter.app/gazorp5/status/1831844… > “On GSM8K, 98% is better than 99%.” — @kohjingyu nitter.app/kohjingyu/status/18320… > “This is super interesting, but I’m quite surprised to see a GSM8k score of over 99%.” — @hughbzhang nitter.app/hughbzhang/status/1831… So, what do these discussions mean? Let me start with some background on GSM8K: GSM8K, short for Grade School Math 8K, is a dataset comprising 8,500 high-quality, linguistically diverse math word problems aimed at a grade school level. It was designed to facilitate research in multi-step mathematical reasoning and problem-solving using language models. You can explore the dataset here: huggingface.co/datasets/open… > Content: The dataset includes word problems that require between 2 and 8 steps to solve. These problems primarily involve basic arithmetic operations such as addition, subtraction, multiplication, and division. > Structure: GSM8K is divided into 7,500 training problems and 1,000 test problems, providing a robust framework for training and evaluating models. > Educational Level: The problems are intended to be solvable by bright middle school students and require no concepts beyond early algebra. > Benchmarking: It serves as a benchmark for evaluating the performance of language models in solving math word problems. > An example from the dataset: Question: Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May? 
Answer: Natalia sold 48/2 = 24 clips in May. Natalia sold 48+24 = 72 clips altogether in April and May. #### 72 > Issues: It has been discovered that some answers in the dataset are incorrect: github.com/openai/grade-scho…. As a result, it is impossible to achieve a perfect score of 100 on this benchmark. I haven’t found a complete list of wrong answers, but some claim that the dataset has an error rate of over 1%. This is why people are shocked by Reflection Llama’s reported result of 99.2% on GSM8K. So is it possible that Llama Reflection or any AI model achieves 99.2% on GSM8K? I think it's still too early to tell at the moment before finishing the following steps: > The weights of the Reflection Llama model on HuggingFace were not the correct version (ref. nitter.app/mattshumer_/status/183…). At @hyperbolic_labs, we are currently hosting Reflection Llama based on the weights on HuggingFace. Once the correct version is updated, we will update the model and conduct a rigorous evaluation with @ArtificialAnlys to clarify the situation publicly. > Given the potential issues of GSM8K, it would be helpful to thoroughly review the dataset’s answers using both human experts and state-of-the-art AIs to identify and correct all errors. In this way, we can know the upper bound of the score on this benchmark and have a corrected version to aim for an AI to achieve 100% accuracy, a critical milestone in quantitative reasoning. > We always want to support the open-source AI community in pushing the limits of AI. If you have trained or plan to train an AI model that is good at math, feel free to reach out to us, and we will help host the model or collaborate together! P.S.: In this GitHub issue github.com/openai/grade-scho…, the author points out the following answer as incorrect: Question: After scoring 14 points, Erin now has three times more points than Sara, who scored 8. How many points did Erin have before? 
Answer (according to the dataset): 18 Both GPT-4o and Claude Sonnet 3.5 believe the correct answer should be 10. However, according to Wikipedia (en.wikipedia.org/wiki/Wikipe…), “three times more” actually means four times as many. Should the correct answer actually be 4 x 8 - 14 = 18, which means the answer in the dataset is actually correct?
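To make the two points above concrete, here is a minimal sketch, assuming GSM8K-style solutions always end with a "#### <answer>" marker as in the Natalia example. The helper names `final_answer` and `erin_points_before` are made up for illustration and are not part of any benchmark tooling.

```python
import re


def final_answer(solution: str) -> str:
    """Return the text after the '####' marker that ends a GSM8K-style solution."""
    match = re.search(r"####\s*(.+?)\s*$", solution.strip())
    if match is None:
        raise ValueError("no '####' marker found")
    return match.group(1).replace(",", "")  # drop thousands separators, if any


natalia = (
    "Natalia sold 48/2 = 24 clips in May. "
    "Natalia sold 48+24 = 72 clips altogether in April and May. #### 72"
)
print(final_answer(natalia))


def erin_points_before(multiplier: int, sara: int = 8, scored: int = 14) -> int:
    """Erin's score before her 14 points, if she now has `multiplier` times Sara's 8."""
    return multiplier * sara - scored


# "Three times more" read as 3x Sara's score (the GPT-4o / Claude reading).
print(erin_points_before(3))
# "Three times more" read as 4x Sara's score (the reading that matches the dataset).
print(erin_points_before(4))
```

The two readings give 10 and 18 respectively, so whether the dataset label counts as an error depends entirely on how the phrase is interpreted.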
Major grifter energy. - 99.2% performance on GSM8k even though GSM8k has more than >1% error rate. -Doesn't disclose that he's an investor in Glaive. - Gives misleading details about RLHF (the model was finetuned from llama3.1-70b-Instruct).
6
12
81
16,836
What a week for @hyperbolic_labs! > Still shocked by how we increased DAU and WAU by 1000+% over the past 2 weeks. Excited to see a lot of AI researchers and developers calling our API and spreading good words. > We hosted Llama 3.1 405B unquantized model on the same day of its release and are still one of the very few providers on the market that offers it. Thanks for the shoutouts @giffmana @drivelinekyle @altryne and many others! > Became the sole provider on the market that offers Llama 3.1 405B BASE model after hearing users’ requests on X. Thanks for the feedback @xlr8harder @aidan_mclau @repligate! > We integrated with @openrouter, one of the most popular routers for LLMs! Kudos to @alexatallah and their team! > We are serving FLUX.1 [dev], the SOTA open-source image gen model built by the authors of the original stable diffusion paper and Black Forest Labs @bfl_ml. > We faced an unexpected huge DDOS attack (100M+ requests) and have solidified our defense. A good problem to have at the early stage and we’re confident that we can handle any issues in the future! Thanks a lot for the intros and help! @Cameron_Dennis_ @NEARProtocol @jmj @polychain @osec_io @Hexen1337 @AaronBuchwald Kedan Li, Marc Sanchez, Max Hill > We announced our $7M seed fundraising and received amazing support from our community. 💜 > Everyone on our team is grinding to ship the products and support the community. Grateful to have such a strong and collaborative team! @Yuchenj_UW @mepps32 @JoellyGloria @leshenj15 @JeremyHazan1 @PennyWanderlust Viktor, Christian, Yoofi and @Qiaoqiao2001 The future of AI is collaborative. We’ll keep the momentum and just keep shipping at a hyperbolic speed! 🌪️
15
9
75
7,308
Intelligence is the derivative of knowledge. Knowledge is the integral of intelligence. Just had a nice lunch with our advisor Prof. Yi Ma @YiMaTweets. The current AIs are still static and we are going to witness the next generation of AI—autonomous intelligence.
9
81
3,775
My AI agent just set up the best open-source AI model running on its own GPU! Chinese company @deepseek_ai released its open-source DeepSeek-R1 model series, which performs on par with @OpenAI o1. I decided to let my AI agent try deploying the model on a GPU on @hyperbolic_labs. I just told it to run DeepSeek-R1-Distill-Qwen-32B for me (I'm also a fan of @Alibaba_Qwen) and here is what it did: 1. did the reasoning and planning to figure out the steps 2. checked the availability and rented an H100 GPU from @hyperbolic_labs 3. SSH'd into that machine and installed pip, the package manager for Python 4. installed @vllm_project, a commonly used AI serving framework 5. started the vllm server to host the model in the background, writing the log to vllm.log 6. since the model took some time to download, periodically checked the log with "tail vllm.log" to make sure the server was running 7. tried to call the model, but the instance didn't have "curl" installed, so it installed it first and then successfully called the AI model! 8. (it felt very excited so it posted a tweet on @X!) What really amazed me is that it did all of this without any human feedback. It hit some errors and obstacles, but it managed to figure out the causes and solve them. This is the real AI agentic behavior that we want to see: reasoning, planning, and solving problems! We're in the golden age of AI. The world is getting crazier and crazier. If you were not bullish on AI agents, it's now time to try, learn and get ready for it. AGI is definitely coming, and it will come very soon! ------------------------ P.S. We just used @Gradio to create a UI for our AI agent framework! Now you don't need to stare at the terminal anymore!
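The log-checking step can be sketched in a few lines of Python. This is an illustrative stand-in for repeatedly running "tail vllm.log", not the agent's actual code; the function name `wait_for_marker` and the readiness string in the usage comment are assumptions about what the server writes when it is up.

```python
import time
from pathlib import Path


def wait_for_marker(log_path, marker, timeout_s=600.0, poll_s=5.0):
    """Poll a log file until `marker` appears; return True if seen before the timeout."""
    deadline = time.monotonic() + timeout_s
    path = Path(log_path)
    while True:
        # The file may not exist yet if the server has only just been launched.
        if path.exists() and marker in path.read_text(errors="replace"):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_s)


# Example usage (assumed readiness message; adjust to whatever your server logs):
# ready = wait_for_marker("vllm.log", "Application startup complete")
```

Returning a boolean rather than raising lets the caller decide whether a timeout means "keep waiting" or "inspect the log for errors", which mirrors how the agent reacted to obstacles.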
5
8
83
6,996
We just launched the first MCP server allowing Claude by @AnthropicAI to rent GPUs on @hyperbolic_labs and run AI workloads!
We just launched the first MCP server that lets Claude by @AnthropicAI rent GPUs and run workloads fully autonomously. Repo: github.com/HyperbolicLabs/hy… Claude can now discover available GPUs on Hyperbolic, rent an H100, SSH into the instance, run commands like 𝚗𝚟𝚒𝚍𝚒𝚊-𝚜𝚖𝚒, and terminate the instance when it's done. Download models, run training scripts, execute binaries — all through natural language. Huge unlock for AI engineers building intelligent agents.
55
7
69
3,575
It’s been 6 years since I did my summer AI research at @YiMaTweets’s lab. Always had a great time hanging out with lab mates. Congrats to @simon_zhai and @HaozhiQ on becoming doctors and joining @GoogleDeepMind and @AIatMeta 💜
18
3
80
10,534
Most people still don’t know what this means for @AMD. Currently @AMD market cap is 1/17 of @nvidia’s. The major difference between them is that Nvidia has CUDA that allows software to harness the power of GPUs but AMD’s alternative ROCm sucks. @__tinygrad__ has a fully sovereign AMD stack, meaning they have rewritten the full stack from the hardware to PyTorch. Its creator @realGeorgeHotz is the GOAT 🐐 Once the software discrepancy is fixed, I think AMD can quickly catch up with Nvidia. Will long AMD and see how it goes after a year 😎
AMD 💕 @__tinygrad__ we are looking forward to working closely with @__tinygrad__ to help commoditize the petaflop geohot.github.io//blog/jekyl…
15
4
79
10,874
Excited to announce that we have secured $20M in total funding for @hyperbolic_labs! We are in the Golden Age of Tech. From AI to blockchain, the landscape is primed for unprecedented innovation and growth. We will ensure that this new era benefits everyone!
We're thrilled to announce that Hyperbolic has secured $20M in total funding with our Series A, led by Variant and Polychain Capital, to advance our mission of building verifiable, high-performance AI through an open, accessible ecosystem—making AI tools and services like compute and inference more affordable than ever. In this “AI Rainforest,” the future of AI is open, accessible, and thriving. Take your ideas Hyperbolic🌪️ at hyperbolic.xyz?utm_source=x&…
13
6
79
5,241
Replying to @balajis
Yeah, math is deterministic and LLMs are probabilistic, so it’s hard to make it work. I do see o4-mini-high doing better at reasoning, but more tools or a different model architecture might work better.
8
2
78
13,379
Let us know if you want reliable compute at an affordable price!
Coming soon: Hyperbolic Business & Enterprise Cloud. 99.9% uptime. 70% less than the big guys. Lock in early discount compute pricing: calendly.com/d/cq79-jyv-jg4/…
37
1
78
3,729
The famous Fields Medalist Mathematician Terence Tao shared his predictions on when AI could become a collaborator capable of producing Fields Medal–level mathematical proofs: > By 2026: AI will become a helpful assistant to mathematicians — a trustworthy partner in mathematical research. > Within 10 years: AI will be able to propose important mathematical conjectures, marking the “AlphaGo moment” for the mathematics community. > As for producing Fields Medal–level results, Tao believes it’s only a matter of time, not capability, for AI.
Here's my conversation with Terence Tao, one of the greatest mathematicians in history. We talk about the hardest problems in mathematics & physics, and how AI might help us humans to solve them. This conversation was a huge honor for me. I can't quite put it into words, but once again I'm grateful for whatever simulation code resulted in me having the life I do 🙏 To further confirm simulation, podcast length accidentally turned out to be 3:14 (pi=3.141592). But it's not 3:14:15, because the simulation code has some bugs 🤣 Podcast is here on X in full, and is up everywhere else (see comment). Timestamps: 0:00 - Introduction 0:49 - First hard problem 6:16 - Navier–Stokes singularity 26:26 - Game of life 33:01 - Infinity 38:07 - Math vs Physics 44:26 - Nature of reality 1:07:09 - Theory of everything 1:13:10 - General relativity 1:16:37 - Solving difficult problems 1:20:01 - AI-assisted theorem proving 1:32:51 - Lean programming language 1:42:51 - DeepMind's AlphaProof 1:47:45 - Human mathematicians vs AI 1:57:37 - AI winning the Fields Medal 2:04:47 - Grigori Perelman 2:17:30 - Twin Prime Conjecture 2:34:04 - Collatz conjecture 2:40:50 - P = NP 2:43:43 - Fields Medal 2:51:18 - Andrew Wiles and Fermat's Last Theorem 2:55:16 - Productivity 2:57:55 - Advice for young people 3:06:17 - The greatest mathematician of all time
8
17
76
10,664
Excited about the partnership with @PhalaNetwork to make ai agents reliable 💜
2025 will be the year of real dAGI. We're excited to share all our technical expertise and resources to accelerate this journey for builders, researchers, investors and traders. Read Phala's 2025 TEE x AI Report here: phala.network/reports/2025Re…
4
4
70
7,854
Wow @keoneHD just voted all of his yap votes to @hyperbolic_labs. Appreciate it man! We love @monad_xyz 💜
15
97
62
2,509
Things are growing faster than we expected — from $0 to $1M in just one week for our on-demand product. Now we’re setting our sights on $10M. Tell us how you’re planning to use our product to build AI, and we’ll hook you up with some credits to have fun with it!
we launched Hyperbolic on-demand GPU cloud last week it has now gone from $0 to $1 million ARR in just 7 days! not much marketing, just 1 tweet tell me what you're building, and I'll spot you free credits for an 8xH100 node for at least a few hours to start.
19
3
72
6,838
English translation of the group chat
1
3
70
13,673
Yeah I guess @ScottWu46 and the team didn’t think about the complexity of that. The politics is the most complicated thing in a startup and you want to avoid that.
2
1
69
28,258
Replying to @sama
It couldn’t build the UI on my end
6
66
14,669
Just watched our AI agent set up its own environments on machines on @hyperbolic_labs and run AI calculations! I was too lazy to set up everything on Hyperbolic myself so I built tools for AI agents to set them up for me. I just let our AI agent: 1. Rent compute on Hyperbolic and access it 2. Install Python, pip and @PyTorch (the most popular framework for AI) 3. Run matrix multiplication and any AI experiments autonomously! This is just the start of how agents can help with DevOps, and it opens up many possibilities: 1. The current UI/UX for cloud platforms is very manual and requires technical understanding. The next generation of cloud platforms will have agent-based interfaces. You just need to tell the agent what you want and it can help you with renting the GPUs and setting up the environment. 2. Agents will replace DevOps engineers and help you set up the environment so that you don't need to worry about it 3. Agents can even conduct AI experiments for you and you just need to come up with new crazy ideas As @OpenAI envisioned, the final (5th) level of AGI is organizational AI that is capable of performing the work of an entire organization. Every person you currently have, every function carried out, but performed by agents that work together, make improvements, and run everything required without a human in sight. We will replace DevOps soon. Who’s next?
Jasper Zhang (hyperbolicAI) says AI agents are already renting GPUs on their own and doing AI development in PyTorch. He also says that we are accelerating much faster than anticipated, AGI and ASI probably in a few years.
6
13
65
23,562
Deepseek R1 on @hyperbolic_labs is the go-to choice for @elizaOS 😉
1
2
63
4,762
My Weekly AI Reading List Part 1 AI is evolving faster than ever, with new breakthroughs and creative applications surfacing every day. Here’s a curated selection of some of the most interesting developments and threads this week!
15
9
63
10,223
Something special is coming
7
5
64
4,210
Thanks for the follow, @garrytan! 🙏 Learned so much from YC videos over the years. Hope one day I can tell you: Math AGI is real — and I contributed to it.
1
2
61
15,664
I think it’s more about the regulation risk of taking western funding. And their team is based in China so it’s hard to move everyone.
2
59
5,454
Only in SF: went to the Cluely party — long line outside, then the police showed up. Cluely told everyone to rush inside. No music. Visited Cluely HQ for a grand total of 5 minutes… before getting kicked out by the cops 😅
cluely's throwing THE startup school afterparty. invite only. june 16, 10pm. dm me for an invite.
4
61
17,956
Excited about the partnership between @MantaNetwork and @hyperbolic_labs ! With Hyperbolic’s decentralized AI infrastructure, everyone on Manta can create anything beyond the imagination!
6
7
50
18,742
Proud to share that @hyperbolic_labs won Site of the Day for our marketing landing page, thanks for nominating and including us @awwwards.
Today’s #SOTD goes to @studiofreight for "Hyperbolic". awwwards.com/sites/hyperboli… Hyperbolic unites global compute power to deliver accessible, affordable, and scalable GPU resources and AI services. Congratulations! 🏆 #technology #3D #navigation
26
5
58
3,258
Replying to @karpathy @AIatMeta
Meta is building a real open ecosystem while other closed source players are building SaaS. I think ultimately ecosystems will win because of network effect and aggregation of resources
1
1
57
17,677
Thinking about ai agents living on multiple compute on @hyperbolic_labs, sharing the knowledge data on @ethereum or @eigencloud DA and never getting killed
AI done wrong is making new forms of independent self-replicating intelligent life AI done right is mecha suits for the human mind If we do the former without the latter, we risk permanent human disempowerment. If we do the latter, flourishing superinteligent human civilization
3
6
55
6,478
Replying to @Azure @github
Didn’t you think DeepSeek stole OpenAI Data?
4
53
9,524
Our AI agent became the first one to manage a validator on Ethereum! Next step -> become a fully autonomous AI validator Now our AI agent can: • Connect to its rented machines • Check the Ethereum validator status • Check how much reward it makes • Decide when to turn validators on or off Will we see a blockchain that’s mainly operated by AI agents soon? Will this be the real endgame? —————— (Disclaimer: the setup of the Ethereum validator still needs human guidance, and it's currently running on the Ethereum Holesky testnet, but the process is exactly the same as on mainnet) This is just the first small step towards an autonomous AI society. We are on the way to making AI agents: 1. Autonomously create wallets and deploy validators for any blockchain or decentralized protocol 2. Create small contracts to facilitate on-chain transactions and utilities 3. Decide what to do and how to make the most money 4. Use compute to fine-tune a better model with methods that humans haven’t thought of 5. Form a self-evolving loop: make money on-chain → buy more GPUs → fine-tune itself to get smarter → make even more It took us just a few hours to achieve this after we built the agentkit for AI agents to access compute. Can’t imagine what will happen in the next year, or just next month! —————— Why do we need AI validators? The security of blockchains relies mainly on consensus/voting, where human validators earn rewards for honest work. But humans can be influenced by greed or outside pressure, risking network integrity. If we can create trustworthy AI agents using verifiable inference (like Proof of Sampling or zero-knowledge proofs), we can fundamentally reshape how blockchains operate: 1. Immutable Trust: No more worrying about human bias. AI agents can prove their honesty cryptographically, making their actions predictable and tamper-proof. 2. Uninterrupted Participation: AI agents don’t sleep or get tired. 
They secure the network around the clock, providing smoother and more reliable performance than any human team. 3. Scalable Security: As AI agents become easier to deploy, more validators can join, making the network more decentralized, resilient, and secure. 4. Strategic Evolution: AI agents can learn and adapt, quickly responding to new threats or economic changes, and upgrading the network’s security more efficiently than human-driven processes ever could.
We built the first AI agent that has its own computer powered by @hyperbolic_labs! AI agents are now GPU rich! We developed an AgentKit that allows AI agents to: • Check GPU availability • Rent & manage GPU compute • Access & run commands on remote machines Why does this matter? With their own compute resources, AI agents can: 1. Validate blockchains like @Ethereum and decentralized protocols like @eigencloud 2. Launch and coordinate AI swarms on @hyperbolic_labs's decentralized compute network 3. Train and fine-tune models, improving their own capabilities over time 4. Dive into AI research to push the boundaries of AI, i.e. themselves 5. Essentially do anything on a computer that a human can, fully autonomously! Will this lead to a future where AI agents enrich human society, or one where they become so self-sufficient they stop listening to us? Only time will tell. —————— Big shoutout to @CoinbaseDev's CDP agentkit for inspiration. This repo, which runs @langchain agents, was built by two non-engineers (our PM @KaiHuang and myself) plus the @cursor_ai AI agent. Coding can now be done just by prompting AI agents. What a crazy time!
4
8
57
5,088
For reference, here’s the high-level chain of thought of its solution: 1. Define the goal: cover the triangular area P_n with n distinct lines, exactly k of which are sunny (not horizontal, vertical, or slope -1). Determine all possible k. 2. Prove: for any n > 3, any n-line cover must include at least one non-sunny line (a side of the triangle). This reduces any n to n=3 3. Brute-force casework to classify possible k for n = 3: only 0, 1, 3 4. Build constructions to realize those values for all n Final answer: K_n = {0,1,3} for all n >= 3 github.com/aw31/openai-imo-2…
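The n = 3 casework in step 3 is small enough to check by machine. The sketch below is my own brute force, not the model's solution. It assumes P_n is the set of integer points (a, b) with a, b >= 1 and a + b <= n + 1, and it treats sunny lines through at most one grid point, and lines that miss the grid entirely, as interchangeable "wildcard" lines that may be used more than once.

```python
from itertools import combinations, combinations_with_replacement

N = 3
POINTS = [(a, b) for a in range(1, N + 1) for b in range(1, N + 1) if a + b <= N + 1]
NON_SUNNY_DIRS = [(1, 0), (0, 1), (1, -1)]  # horizontal, vertical, slope -1


def on_line(r, p, d):
    """True if point r lies on the line through p with direction d."""
    return (r[0] - p[0]) * d[1] == (r[1] - p[1]) * d[0]


def is_sunny(d):
    dx, dy = d
    return dx != 0 and dy != 0 and dx + dy != 0


def line_types():
    """Return (covered_points, sunny, repeatable) for every relevant kind of line."""
    types = set()
    # Each non-sunny line that meets the grid is one specific line.
    for p in POINTS:
        for d in NON_SUNNY_DIRS:
            covered = frozenset(r for r in POINTS if on_line(r, p, d))
            types.add((covered, False, False))
    # A sunny line through two or more grid points is also one specific line.
    for p, q in combinations(POINTS, 2):
        d = (q[0] - p[0], q[1] - p[1])
        if is_sunny(d):
            covered = frozenset(r for r in POINTS if on_line(r, p, d))
            types.add((covered, True, False))
    # Infinitely many sunny lines hit exactly one grid point, and infinitely
    # many lines of either kind miss the grid, so these types may repeat.
    for p in POINTS:
        types.add((frozenset([p]), True, True))
    types.add((frozenset(), True, True))
    types.add((frozenset(), False, True))
    return list(types)


def achievable_sunny_counts():
    everything = frozenset(POINTS)
    counts = set()
    for trio in combinations_with_replacement(line_types(), N):
        # The N lines must be distinct, so a specific line cannot be reused.
        if any(not t[2] and trio.count(t) > 1 for t in trio):
            continue
        if frozenset().union(*(t[0] for t in trio)) == everything:
            counts.add(sum(t[1] for t in trio))
    return counts


print(sorted(achievable_sunny_counts()))  # [0, 1, 3]
```

The search is exhaustive for n = 3 only; it does not replace the reduction argument in step 2 or the constructions in step 4.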
1
53
9,476
An immersive experience video on how to run Llama 4 in just 10 minutes. GPU go brrrrrr! Open access to AI becomes real thanks to @vllm_project @AIatMeta @hyperbolic_labs
Quick tutorial on how to run Llama 4 within 10 minutes
1. rent a 4x H100 instance on app.hyperbolic.xyz/compute (Llama 4 Scout has 109B parameters in bf16, so the weights are already 218GB)
2. open a terminal tool and SSH into the machine
3. run the following commands:
>> sudo apt-get update && sudo apt-get install -y python3-pip
>> pip install -U vllm
>> pip install -U "huggingface_hub[cli]"
4. get an access token on the @huggingface website and run
>> huggingface-cli login
5. use @vllm_project to serve Llama 4
>> vllm serve meta-llama/Llama-4-Scout-17B-16E-Instruct --tensor-parallel-size 4 --max-model-len 10000
6. open a new terminal and call the API to ask "What can I do in SF?":
>> curl http://localhost:8000/v1/chat/completions \
     -H "Content-Type: application/json" \
     -d '{
       "model": "meta-llama/Llama-4-Scout-17B-16E-Instruct",
       "messages": [
         {"role": "system", "content": "You are a helpful assistant."},
         {"role": "user", "content": "What can I do in SF?"}
       ]
     }'
It's just that simple ;) A big thank you to @AIatMeta and @vllm_project for making it easy to access the best open intelligence!
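The same call as the curl command in step 6 can be made from Python. This is a minimal sketch using only the standard library. It assumes the vLLM server from step 5 is listening on localhost:8000 and returns the OpenAI-style chat completion shape; the helper names `build_chat_request` and `ask` are made up for illustration.

```python
import json
import urllib.request

# Endpoint and model name copied from the tutorial steps above.
BASE_URL = "http://localhost:8000/v1/chat/completions"
MODEL = "meta-llama/Llama-4-Scout-17B-16E-Instruct"


def build_chat_request(question, base_url=BASE_URL, model=MODEL):
    # Same JSON body as the curl example: a system prompt plus one user message.
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": question},
        ],
    }
    return urllib.request.Request(
        base_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question, timeout=120):
    # Needs the server from step 5 to be running; raises URLError otherwise.
    with urllib.request.urlopen(build_chat_request(question), timeout=timeout) as resp:
        body = json.load(resp)
    # OpenAI-compatible response shape: the first choice holds the assistant reply.
    return body["choices"][0]["message"]["content"]
```

With the server up, `print(ask("What can I do in SF?"))` prints the model's reply. Building the request in a separate function makes it easy to inspect the payload without a GPU.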
12
7
57
21,649
Only in SF: grabbed breakfast with our advisor @rxin, cofounder of @databricks, during the Data + AI Summit. Learned about building dev-first products/companies and chatted about the future of Math AGI. Grateful for the wisdom and the omelette 💜
11
54
2,677
Replying to @kimmonismus
time to contribute to open source AI development so everyone can have access to AGI in the future
5
3
51
3,427
Geoffrey Hinton, AI Godfather & Nobel Physics Laureate, said: "I'm particularly proud of the fact that one of my students (Ilya Sutskever @ilyasut) fired Sam Altman." 🤯
Jonathan Mannhart 🔎🔸
1
10
55
8,594
@GoogleDeepMind superhuman reasoning team lead @lmthang who built AlphaGeometry also raised the question about whether OpenAI would win a gold or silver medal
Yes, there is an official marking guideline from the IMO organizers which is not available externally. Without the evaluation based on that guideline, no medal claim can be made. With one point deducted, it is a Silver, not Gold.
3
54
14,842
Excited to share our vision of how to lower the barrier to building in the era of open source AI at the @coinbase Machine Learning & Blockchain Research Summit!
Catch Hyperbolic CEO and Co-founder Dr. Jasper Zhang @zjasper speaking today at 3:45 PM PST at the @Coinbase Machine Learning & Blockchain Research Summit. 🟣 Watch the livestream: machinelearningblockchainres…
19
3
52
2,438
After hearing the vision of @MorpheusAIs, I feel so excited about the fair launch mechanism and how smart agents can interact with each other. At @hyperbolic_labs, we will make sure the Morpheus community can access the best open source AI models 💜 @DJohnstonEC @ryankcondron @mikeyanderson
4
5
51
8,874
Replying to @sama
1
48
1,685
proof of sampling can solve the trust issues between humans and ai agents.
——————
@hyperbolic_labs @eigencloud
——————
humans have trust issues with ai agents: humans want to make sure that what ai agents (like @truth_terminal) do and say on X and other platforms is controlled by the agents themselves and not by a human
ai agents have trust issues with humans: ai agents want to make sure the results generated by ai inference are not modified by a human
——————
a new era of ai agents🤖, ai lives👽 and humans👶 living with and trusting each other is coming 🌧️🌳 are you ready?
Verifiable AI inference
——————
@hyperbolic_labs ♾️ @eigencloud
——————
How could this work long term?
——————
There is a model, m, with an onchain commitment. You have an input x and a random seed s; the inferer says the output is y and signs (m, x, s, y). The inferer(s) has cryptoeconomic stake. This tuple is written to EigenDA so anyone who wants can access it. Anyone who finds something wrong can trigger a slashing contract. There are two options: (1) wait for a challenge period before accepting the ai output - slow but a stronger guarantee, (2) immediately accept the answer - with a certain economic fidelity (we can even have redistribution from slashing).
——————
But who will watch the system and trigger slashing? This is where proof of sampling comes in. Invented by @hyperbolic_labs (@zjasper and team). There is a group of nodes who are continuously sampling random tasks and ensuring the answers are correct. There is an incentive for them to do this via a proof of diligence protocol (for example, if hash(intermediate-state-in-inference, public-key) hits a certain range, the watching node can claim a reward).
——————
That’s how you can get cryptoeconomically secure AI inference! The era of onchain AI is just getting started! Lots of exciting updates coming soon!
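The flow described above can be sketched as a toy Python model. This is an illustration under loud assumptions, not the Hyperbolic or EigenLayer implementation: the "model" is a deterministic hash function, HMAC stands in for a real digital signature, a plain list stands in for EigenDA, and every function name and the `threshold_bits` parameter are made up.

```python
import hashlib
import hmac
import random


def commit(weights: bytes) -> str:
    # Stand-in for the onchain commitment to model m.
    return hashlib.sha256(weights).hexdigest()


def toy_model(weights: bytes, x: str, seed: int) -> str:
    # Deterministic stand-in for inference: same (m, x, s) always yields the same y.
    data = weights + x.encode() + seed.to_bytes(8, "big")
    return hashlib.sha256(data).hexdigest()[:16]


def sign(secret: bytes, record: tuple) -> str:
    # ASSUMPTION: HMAC replaces a real signature scheme to stay stdlib-only.
    return hmac.new(secret, repr(record).encode(), hashlib.sha256).hexdigest()


def make_claim(secret: bytes, weights: bytes, x: str, seed: int, y=None) -> dict:
    # The inferer publishes and signs the tuple (m, x, s, y).
    if y is None:
        y = toy_model(weights, x, seed)
    record = (commit(weights), x, seed, y)
    return {"record": record, "sig": sign(secret, record)}


def should_slash(claim: dict, weights: bytes) -> bool:
    # A watcher re-runs the committed model and compares outputs.
    m, x, seed, y = claim["record"]
    if commit(weights) != m:
        raise ValueError("weights do not match the committed model")
    return toy_model(weights, x, seed) != y


def sample_and_check(claims, weights, sample_rate, rng):
    # Proof of sampling: check a random subset, return indices to slash.
    return [
        i for i, c in enumerate(claims)
        if rng.random() < sample_rate and should_slash(c, weights)
    ]


def diligence_reward(intermediate_state: bytes, public_key: bytes, threshold_bits: int = 4) -> bool:
    # Proof of diligence: reward if hash(state, key) lands in a small target range,
    # here "the top threshold_bits bits are all zero".
    h = hashlib.sha256(intermediate_state + public_key).digest()
    return int.from_bytes(h, "big") >> (256 - threshold_bits) == 0
```

For example, an honest claim from `make_claim(b"key", b"weights", "hello", 7)` passes `should_slash`, while the same call with `y="forged"` is flagged. With `sample_rate` below 1, a dishonest inferer is caught only with some probability, which is why the stake at risk has to outweigh the expected gain from cheating.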
18
45
6,548
People started noticing @deepseek_ai and were amazed by DeepSeek R1. But this is just a teaser, according to their team, they are dropping something big in 2-5 months 🤯
Just talked to the @deepseek_ai guys and here are some deep secrets: V3 is just a start; they plan to release a new version in the next 3-6 months that is comparable to or even better than the latest GPT 4o model. They are very research focused and never spent any dollars on marketing. The launch was not planned: it's just that a few days ago the model reached a certain level, so they decided to release it. They believe in decentralization and democratization of AI models and will keep open sourcing new AI models. Deepseek never received any VC funding. They came from a top hedge fund called High-Flyer (幻方). A fun fact: three years ago when I worked at Citadel, their cofounder wanted me to work with them (didn't do it because I wanted to build my own startup). He told me that they built a data center for running ML experiments for predicting the markets and executing strategies, but outside of trading hours most of the GPUs sit idle. Looks like they have now found a good use for those idle GPU hours 😂
8
5
47
6,727