Interconnects · Aug 7, 2024 · 1:05 PM UTC

Interconnects

@interconnectsai

7 Aug 2024

A recipe for frontier model post-training Apple, Meta, and Nvidia all agree — synthetic data, iterative training, human preference labels, and lots of filtering. interconnects.ai/p/frontier-…

A recipe for frontier model post-training

Apple, Meta, and Nvidia all agree — synthetic data, iterative training, human preference labels, and lots of filtering.

interconnects.ai

170

83,512

Interconnects · Dec 4, 2024 · 3:26 PM UTC

Interconnects

@interconnectsai

4 Dec 2024

OpenAI's o1 using "search" was a PSYOP How to understand OpenAI's o1 models as really just one wacky, wonderful, long chain of thought. interconnects.ai/p/openais-o…

OpenAI's o1 using "search" was a PSYOP

How to understand OpenAI's o1 models as really just one wacky, wonderful, long chain of thought

interconnects.ai

165

48,961

Interconnects · Nov 29, 2023 · 3:59 PM UTC

Interconnects

@interconnectsai

29 Nov 2023

Synthetic data: Anthropic’s CAI, from fine-tuning to pretraining, OpenAI’s Superalignment, tips, types, and open examples Synthetic data is the accelerator of the next phase of AI — what it is and what it means. interconnects.ai/p/llm-synth…

Synthetic data: Anthropic’s CAI, scaling, OpenAI’s Superalignment, tips, and open-source examples

Synthetic data is the accelerator of the next phase of AI — what it is and what it means.

interconnects.ai

121

93,054

Interconnects · Sep 16, 2024 · 2:09 PM UTC

Interconnects

@interconnectsai

16 Sep 2024

Reverse engineering OpenAI’s o1 What productionizing test-time compute shows us about the future of AI. Exploration has landed in language model training. interconnects.ai/p/reverse-e…

Reverse engineering OpenAI’s o1

What productionizing test-time compute shows us about the future of AI. Exploration has landed in language model training.

interconnects.ai

116

39,371

Interconnects · Aug 17, 2025 · 3:43 PM UTC

Interconnects

@interconnectsai

17 Aug 2025

China's Top 19 Open Model Labs We ranked all the organizations in China releasing open models, from the top of DeepSeek to small, newer academic labs making waves with tech reports and niche models. interconnects.ai/p/chinas-to…

Ranking the Chinese Open Model Builders

From the obvious names to those to keep an eye on.

interconnects.ai

118

201,650

Interconnects · Apr 19, 2025 · 4:14 PM UTC

Interconnects

@interconnectsai

19 Apr 2025

OpenAI's o3: Over-optimization is back and weirder than ever Tools, true rewards, and a new direction for language models. interconnects.ai/p/openais-o…

OpenAI's o3: Over-optimization is back and weirder than ever

Tools, true rewards, and a new direction for language models.

interconnects.ai

38,575

Interconnects · Feb 13, 2025 · 3:45 PM UTC

Interconnects

@interconnectsai

13 Feb 2025

An unexpected RL Renaissance New talk! Forecasting the Alpaca moment for reasoning models and why the new style of RL training is a far bigger deal than the emergence of RLHF. YouTube: piped.video/watch?v=YXTYbr3h… Slides: docs.google.com/presentation… More info: interconnects.ai/p/an-unexpe…

An Unexpected Reinforcement Learning Renaissance

The era we are living through in language modeling research is one ...

youtube.com

45,685

Interconnects · Sep 5, 2024 · 2:25 PM UTC

Interconnects

@interconnectsai

5 Sep 2024

OpenAI’s Strawberry, LM self-talk, inference scaling laws, and spending more on inference Whether or not scaling works, we should spend more on inference. interconnects.ai/p/openai-st…

OpenAI’s Strawberry and inference scaling laws

OpenAI’s Strawberry, LM self-talk, inference scaling laws, and spending more on inference. Coming waves in LLMs.

interconnects.ai

72,700

Interconnects · Jan 2, 2025 · 4:11 PM UTC

Interconnects

@interconnectsai

2 Jan 2025

Quick recap on the state of reasoning -- can LMs reason? How? My talk at the NeurIPS Latent Space live event (pre o3). Slides: docs.google.com/presentation… Post: interconnects.ai/p/the-state… YouTube: piped.video/2pHE9L4ZZXM?si=vM9-…

[12112024, Latent Space @ NeurIPS] Reasoning

The state of reasoning. Nathan Lambert, Ai2 // Interconnects.ai. Latent Space // NeurIPS 2024.

docs.google.com

32,351

Interconnects · Jul 23, 2025 · 6:40 PM UTC

Interconnects

@interconnectsai

23 Jul 2025

This also means you can write off your Interconnects AI subscription. Not official tax advice.

christian

@curious_vii

23 Jul 2025

wow - $5k tax free for ai retooling

4,697

Interconnects · Aug 7, 2025 · 3:12 PM UTC

Interconnects

@interconnectsai

7 Aug 2025

We're all excited about the GPT-5 release. Here's a fun game for while you watch. Potential prizes coming later! Livestream links coming soon.

31,961

Interconnects · Sep 11, 2024 · 2:19 PM UTC

Interconnects

@interconnectsai

11 Sep 2024

Futures of the data foundry business model Scale AI’s future versus further scaling of language model performance. How Nvidia may take all the margins from the data market, too. interconnects.ai/p/ai-data-f…

Futures of the data foundry business model

Scale AI’s future versus further scaling of language model performance. How Nvidia may take all the margins from the data market, too.

interconnects.ai

107,922

Interconnects · Jan 8, 2025 · 3:38 PM UTC

Interconnects

@interconnectsai

8 Jan 2025

The state of post-training in 2025 A re-record of my NeurIPS tutorial on language modeling (plus some added content). Blog + extra context: interconnects.ai/p/the-state… YouTube: piped.video/6yIMb0K-aS4 Slides: docs.google.com/presentation…

The state of post-training in 2025

Watch now (54 mins) | A re-record of my NeurIPS tutorial on language modeling (plus some added content).

interconnects.ai

52,731

Interconnects · Jan 29, 2024 · 4:34 PM UTC

Interconnects

@interconnectsai

29 Jan 2024

Model merging lessons in The Waifu Research Department When what seems like pure LLM black magic is actually supported by the literature. interconnects.ai/p/model-mer…

Model merging lessons in The Waifu Research Department

When what seems like pure LLM black magic is actually supported by the literature.

interconnects.ai

32,381

Interconnects · Jun 28, 2025 · 10:15 PM UTC

Interconnects

@interconnectsai

28 Jun 2025

In the last ~6 months more closely analyzing the open models and datasets of note across the community on @huggingface, we've highlighted artifacts from 141 different organizations. It takes many people to build the open ecosystem for AI.
ACE-Step AI-MO AIDC-AI ASLP-lab Alpha-VLLM AtlaAI BAAI (4) BLIP3o ByteDance (3) ByteDance-Seed (4) CYFRAGOVPL CohereLabs (5) DataoceanAI DatologyAI (2) Datou1111 Etched EuroBERT Freepik (2) GSAI-ML Goedel-LM Hcompany HelloKKMe HiDream-ai HuggingFaceTB (3) ICTNLP JetBrains LGAI-EXAONE LLM360 (2) MiniMaxAI (2) NX-AI NexaAIDev Nexusflow NousResearch NovaSky-AI (2) Open-Reasoner-Zero OpenGVLab OpenPipe OuteAI POLARIS-Project PRIME-RL PeterJinGo PlayHT PleIAs (2) PrimeIntellect (2) Qwen (15) RekaAI Salesforce (2) Skywork (6) Snowflake SparkAudio StarJiaxing SultanR THUDM (3) TIGER-Lab UCSC-VLAA UW-Madison-Lee-Lab Wan-AI (3) WisdomShell XiaomiMiMo (2) Xkev Zyphra (2) agentica-org (2) ai21labs all-hands allenai (5) allura-org amd (3) answerdotai apple arcee-ai (6) arcinstitute bespokelabs canopylabs cl-nagoya convergence-ai deepcogito deepseek-ai (9) ds4sd echo840 facebook (3) fdtn-ai featherless-ai fishaudio genmo google (9) haizelabs hexgrad hkust-nlp hpcai-tech ibm-granite (9) inclusionAI (4) infgrad internlm (5) kuleshov-group kyutai (4) laion (2) lerobot lightonai m-a-p marin-community maya-multimodal meta-llama (2) metagene-ai microsoft (7) mistralai (6) mixedbread-ai mobiuslabsgmbh moonshotai (4) nanonets nllg nomic-ai (3) nvidia (18) open-r1 open-thoughts openbmb (3) opencompass osmosis-ai ostris perplexity-ai qihoo360 rednote-hilab reducto rhymes-ai (3) ruliad sand-ai sarvamai sesame si-community simplescaling stabilityai (2) stepfun-ai (4) tencent (3) thomas-sounack tiiuae (2) tngtech tomg-group-umd vidore (2) vikhyatk xlr8harder yentinglin zed-industries

5,884

Interconnects · Mar 28, 2024 · 12:09 AM UTC

Interconnects

@interconnectsai

28 Mar 2024

DBRX: The new best open model and Databricks’ ML strategy Databricks’ new model is surpassing the performance of Mixtral and Llama 2 70B while still being in a size category that's reasonably accessible. interconnects.ai/p/databrick…

DBRX: The new best open model and Databricks’ ML strategy

Databricks’ new model is surpassing the performance of Mixtral and Llama 2 70B while still being in a size category that's reasonably accessible.

interconnects.ai

21,599

Interconnects · Jun 28, 2025 · 6:14 PM UTC

Interconnects

@interconnectsai

28 Jun 2025

Ilya on deep learning in 2015 On vision and how to understand deep learning. interconnects.ai/p/ilya-on-d…

5,515

Interconnects · Jun 26, 2024 · 3:04 PM UTC

Interconnects

@interconnectsai

26 Jun 2024

RLHF roundup: Getting good at PPO, sketching RLHF’s impact, RewardBench retrospective, and a reward model competition Things to be aware of if you work on language model fine-tuning. interconnects.ai/p/rlhf-roun…

RLHF roundup: Getting good at PPO, charting RLHF’s impact, RewardBench retrospective, and a reward...

Things to be aware of if you work on language model fine-tuning.

interconnects.ai

12,660

Interconnects · Aug 7, 2025 · 4:52 PM UTC

Interconnects

@interconnectsai

7 Aug 2025

GPT 5 Launch Party w/ Will Brown & Swyx nitter.app/i/broadcasts/1OyKALjqR…

10,107

Interconnects · Jul 16, 2025 · 5:38 PM UTC

Interconnects

@interconnectsai

16 Jul 2025

As Meta tries to race to build a great new research lab, we wanted to remind everyone that the organization structure is just as much of a challenge as the personnel. Here are our takeaways from earlier in the year. Rec's first.

3,580

Interconnects · Mar 12, 2025 · 2:09 PM UTC

Interconnects

@interconnectsai

12 Mar 2025

Interviewing Eugene Vinitsky (@EugeneVinitsky) on self-play for self-driving and what else people do with RL #13. Reinforcement learning fundamentals and scaling. interconnects.ai/p/interview…

Interviewing Eugene Vinitsky on self-play for self-driving and what else people do with RL

#13. Reinforcement learning fundamentals and scaling.

interconnects.ai

15,864

Interconnects · Apr 5, 2025 · 4:43 PM UTC

Interconnects

@interconnectsai

5 Apr 2025

RL backlog: OpenAI's many RLs, clarifying distillation, and latent reasoning Notes I forgot to publish. Closing some loose ends in the reasoning model discussions. interconnects.ai/p/rl-backlo…

RL backlog: OpenAI's many RLs, clarifying distillation, and latent reasoning

Notes I forgot to publish. Closing some loose ends in the reasoning model discussions.

interconnects.ai

25,557

Interconnects · Jul 5, 2025 · 9:06 PM UTC

Interconnects

@interconnectsai

5 Jul 2025

We're working on best-available analyses of where open models come from, who uses them, and how much. What questions do you have?

Nathan Lambert

@natolambert

5 Jul 2025

There are like 10-20 Chinese orgs shipping open models that I try and keep a somewhat close eye on and there are like 3-4 in the rest of the world 😳

7,468

Interconnects · Feb 26, 2025 · 3:03 PM UTC

Interconnects

@interconnectsai

26 Feb 2025

Character training: Understanding and crafting a language model's personality Post-training in industry is very different than the academic papers and open-source models demonstrate. interconnects.ai/p/character…

Character training: Understanding and crafting a language model's personality

Post-training in industry is very different than the academic papers and open-source models demonstrate. Let's dive into one of my favorite topics in language modeling development today.

interconnects.ai

16,958

Interconnects · Mar 26, 2025 · 1:53 PM UTC

Interconnects

@interconnectsai

26 Mar 2025

Gemini 2.5 Pro and Google's second chance with AI Plus some coverage for the latest DeepSeek. interconnects.ai/p/gemini-25…

Gemini 2.5 Pro and Google's second chance with AI

The end of a busy spring of model improvements and what's next for the presumed leader in AI abilities.

interconnects.ai

73,539

Interconnects · Dec 18, 2024 · 3:44 PM UTC

Interconnects

@interconnectsai

18 Dec 2024

The AI agent spectrum Separating different classes of AI agents from a long history of reinforcement learning. interconnects.ai/p/the-ai-ag…

The AI Agent Spectrum

Separating different classes of AI agents from a long history of reinforcement learning.

interconnects.ai

23,943

Interconnects · Jul 14, 2025 · 3:17 PM UTC

Interconnects

@interconnectsai

14 Jul 2025

Kimi K2 and when "DeepSeek Moments" become normal One "DeepSeek Moment" wasn't enough for us to wake up, hopefully we don't need a third. interconnects.ai/p/kimi-k2-a…

Kimi K2 and when "DeepSeek Moments" become normal

One "DeepSeek Moment" wasn't enough for us to wake up, hopefully we don't need a third.

interconnects.ai

25,316

Interconnects · Jul 12, 2025 · 2:43 PM UTC

Interconnects

@interconnectsai

12 Jul 2025

Grok 4: An o3 look-alike in search, high highs and new lows An o3 class model, the possibility of progress, chatbot beige, and the illusiveness of taste. interconnects.ai/p/grok-4-an…

xAI's Grok 4: The tension of frontier performance with a side of Elon favoritism

An o3 class model, the possibility of progress, chatbot beige, and the illusiveness of taste.

interconnects.ai

42,606

Interconnects · Jun 4, 2025 · 1:56 PM UTC

Interconnects

@interconnectsai

4 Jun 2025

A taxonomy for next-generation reasoning models Where we've been and where we're going with RLVR. interconnects.ai/p/next-gen-…

A taxonomy for next-generation reasoning models

Where we've been and where we're going with RLVR.

interconnects.ai

10,859

Interconnects · Dec 20, 2024 · 11:39 PM UTC

Interconnects

@interconnectsai

20 Dec 2024

OpenAI's o3: The grand finale of AI in 2024 A step change as influential as the release of GPT-4. Reasoning language models are the current and next big thing. interconnects.ai/p/openais-o…

o3: The grand finale of AI in 2024

A step change as influential as the release of GPT-4. Reasoning language models are the current big thing.

interconnects.ai

22,245

Interconnects · Sep 4, 2024 · 3:12 PM UTC

Interconnects

@interconnectsai

4 Sep 2024

OLMoE and the hidden simplicity in training better foundation models Ai2 released OLMoE, which is probably our “best” model yet relative to its peers, but not much has changed in the process. interconnects.ai/p/olmoe-and…

OLMoE and the hidden simplicity in training better foundation models

Ai2 released OLMoE, which is probably our “best” model yet relative to its peers, but not much has changed in the process.

interconnects.ai

1,597

Interconnects · Mar 5, 2025 · 3:27 PM UTC

Interconnects

@interconnectsai

5 Mar 2025

Where inference-time scaling pushes the market for AI companies Fundamentals emerging downstream from the RL reasoning models. interconnects.ai/p/where-inf…

Where inference-time scaling pushes the market for AI companies

Fundamentals emerging downstream from the RL reasoning models.

interconnects.ai

18,118

Interconnects · Dec 20, 2023 · 3:30 PM UTC

Interconnects

@interconnectsai

20 Dec 2023

State-space LLMs: Do we need Attention? Mamba, StripedHyena, Based, research overload, and the exciting future of many LLM architectures all at once. interconnects.ai/p/llms-beyo…

State-space LLMs: Do we need Attention?

Mamba, StripedHyena, Based, research overload, and the exciting future of many LLM architectures all at once.

interconnects.ai

2,056

Interconnects · May 27, 2025 · 1:26 PM UTC

Interconnects

@interconnectsai

27 May 2025

Claude 4 and Anthropic's bet on code Reasons to be optimistic and pessimistic on Anthropic's future. interconnects.ai/p/claude-4-…

Claude 4 and Anthropic's bet on code

Reasons to be optimistic and pessimistic on Anthropic's future.

interconnects.ai

18,406

Interconnects · Jul 22, 2025 · 12:03 AM UTC

Interconnects

@interconnectsai

22 Jul 2025

Latest open artifacts (#12): Chinese models continue to dominate throughout the summer 🦦 A new flagship Qwen model, Qwen3-235B-A22B-Instruct-2507, and a general rise in ecosystem quality in Artifacts Log 12. interconnects.ai/p/latest-op…

Latest open artifacts (#12): Chinese models continue to dominate throughout the summer 🦦

Artifacts Log 12.

interconnects.ai

11,678

Interconnects · Jan 9, 2025 · 8:57 PM UTC

Interconnects

@interconnectsai

9 Jan 2025

DeepSeek V3 and the actual cost of training frontier AI models The $5M figure for the last training run should not be your basis for how much frontier AI models cost. interconnects.ai/p/deepseek-…

DeepSeek V3 and the cost of frontier AI models

The $5M figure for the last training run should not be your basis for how much frontier AI models cost.

interconnects.ai

16,910

Interconnects · Dec 11, 2024 · 4:02 PM UTC

Interconnects

@interconnectsai

11 Dec 2024

OpenAI's Reinforcement Finetuning and RL for the masses The cherry on Yann LeCun’s cake has finally been realized. interconnects.ai/p/openais-r…

OpenAI's Reinforcement Finetuning and RL for the masses

The cherry on Yann LeCun’s cake has finally been realized.

interconnects.ai

32,299

Interconnects · Nov 22, 2023 · 2:57 PM UTC

Interconnects

@interconnectsai

22 Nov 2023

RLHF progress: Scaling DPO to 70B, DPO vs PPO update, Tülu 2, Zephyr-β, meaningful evaluation, data contamination Huge steps forward in confirming that RLHF can really help you on vibes based evaluation, among many other RLHF analyses. interconnects.ai/p/rlhf-prog…

RLHF progress: Scaling DPO to 70B, DPO vs PPO update, Tülu 2, Zephyr-β, meaningful evaluation, data...

Huge steps forward in confirming that RLHF can really help you on vibes based evaluation, among many other RLHF analyses.

interconnects.ai

18,479

Interconnects · Jul 23, 2025 · 4:10 PM UTC

Interconnects

@interconnectsai

23 Jul 2025

The White House's plan for open models & AI research in the U.S. Thoughts on the new AI Action plan, American DeepSeek, and what comes next. interconnects.ai/p/the-white…

The White House's plan for open models & AI research in the U.S.

Thoughts on the new AI Action plan, American DeepSeek, and what comes next.

interconnects.ai

13,830

Interconnects · Aug 14, 2024 · 1:53 PM UTC

Interconnects

@interconnectsai

14 Aug 2024

Artifacts Log 3: Synthetic math and Magpie datasets, another 1T param model, and many Mistral models Artifacts ~124 and on for the year. (partial $) interconnects.ai/p/artifacts…

Artifacts Log 3: Synthetic math and Magpie datasets, another 1T param model, and many Mistral models

Artifacts ~124 and on for the year.

interconnects.ai

1,926

Interconnects · Jun 6, 2025 · 3:25 PM UTC

Interconnects

@interconnectsai

6 Jun 2025

How I Write And therein how I think. And how AI impacts it. interconnects.ai/p/how-i-wri…

How I Write

Therein how I think, how AI impacts it, and how writing reflects upon AI progress.

interconnects.ai

22,683

Interconnects · Dec 13, 2023 · 6:06 PM UTC

Interconnects

@interconnectsai

13 Dec 2023

Big Tech's LLM evals are just marketing A PSA everyone needs. The importance of a wait and see attitude when it comes to new models, big and small, open and closed. interconnects.ai/p/evals-are…

Big Tech's LLM evals are just marketing

A PSA everyone needs. The importance of a wait and see attitude when it comes to new models, big and small, open and closed.

interconnects.ai

1,487

Interconnects · Oct 25, 2023 · 4:21 PM UTC

Interconnects

@interconnectsai

25 Oct 2023

RLHF lit. review #1 and missing pieces in RLHF: Looking at the difference between two sets -- what rumors say industry leaders are doing with RLHF and what the literature is up to. A new series studying RLHF literature. interconnects.ai/p/rlhf-lit-…

RLHF lit. review #1 and missing pieces in RLHF

Looking at the difference between two sets -- what rumors say industry leaders are doing with RLHF and what the literature is up to. I'm starting my new series studying RLHF literature.

interconnects.ai

10,862

Interconnects · Jan 22, 2025 · 3:58 PM UTC

Interconnects

@interconnectsai

22 Jan 2025

Interviewing OLMo 2 leads: Open secrets of training language models What we have learned and are going to do next. YouTube: piped.video/dS7QI99uJVc Notes + Podcast: interconnects.ai/p/olmo-2-po…

61,660

Interconnects · Aug 10, 2025 · 1:20 PM UTC

Interconnects

@interconnectsai

10 Aug 2025

What I'm reading (#2): More on Kimi K2, how to build a bad research center, Pretraining with RL, and sporks of AGI A quiet summer is all you need. interconnects.ai/p/what-im-r…

What I've been reading (#2): More on Kimi K2, how to build a bad research center, Pretraining with...

A quiet summer is all you need.

interconnects.ai

11,742

Interconnects · Feb 28, 2025 · 4:50 PM UTC

Interconnects

@interconnectsai

28 Feb 2025

GPT-4.5: "Not a frontier model"? OpenAI's latest model raises more questions than answers, but no, the AI bubble isn't popping quite yet. interconnects.ai/p/gpt-45-no…

GPT-4.5: "Not a frontier model"?

OpenAI's latest model raises more questions than answers, but no, the AI bubble isn't popping quite yet.

interconnects.ai

20,409

Interconnects · Mar 30, 2025 · 4:36 PM UTC

Interconnects

@interconnectsai

30 Mar 2025

GPT-4o's images and lessons from native input-output multimodality Hints of a natively multi-modal future. interconnects.ai/p/gpt-4os-i…

GPT 4o's images and lessons from native input-output multimodality

Hints of a natively multi-modal future.

interconnects.ai

18,442

Interconnects · Jul 31, 2024 · 3:17 PM UTC

Interconnects

@interconnectsai

31 Jul 2024

GPT-4o-mini changed ChatBotArena And how to understand Llama 3.1’s results on the community's favorite benchmark. interconnects.ai/p/gpt-4o-mi…

GPT-4o-mini changed ChatBotArena

And how to understand Llama 3.1’s results on the community's favorite benchmark.

interconnects.ai

32,967

Interconnects · Jun 21, 2024 · 4:06 PM UTC

Interconnects

@interconnectsai

21 Jun 2024

Frontiers in synthetic data Trends in synthetic data that I'm watching closely in the leading open and closed models. interconnects.ai/p/frontiers…

Frontiers in synthetic data

Trends in synthetic data that I'm watching closely in the leading open and closed models.

interconnects.ai

8,347

Interconnects · May 4, 2025 · 4:48 PM UTC

Interconnects

@interconnectsai

4 May 2025

Sycophancy and the art of the model GPT-4o-simp, LMArena backlash, and people refusing to understand how messy and crucial RLHF is. interconnects.ai/p/sycophanc…

Sycophancy and the art of the model

GPT-4o-simp, LMArena backlash, and people refusing to understand how messy and crucial RLHF is.

interconnects.ai

12,286

Interconnects · Dec 21, 2023 · 3:23 PM UTC

Interconnects

@interconnectsai

21 Dec 2023

Interviewing Tri Dao and Michael Poli of Together AI on the future of LLM architectures The first Interconnects research interview! We go even further on the promise of state-space models in the emerging LLM market. interconnects.ai/p/interview…

Interviewing Tri Dao and Michael Poli on the future of LLM architectures

Listen now | The first Interconnects research interview! We go even further on the promise of state-space models in the emerging LLM market.

interconnects.ai

7,888

Interconnects · Mar 19, 2025 · 2:00 PM UTC

Interconnects

@interconnectsai

19 Mar 2025

Managing frontier model training organizations (or teams) How do the frontier labs consistently train great models? How can they fail? interconnects.ai/p/how-to-ma…

Managing frontier model training organizations (or teams)

How do the frontier labs consistently train great models? How can they fail?

interconnects.ai

28,313

Interconnects · Jun 12, 2025 · 5:58 PM UTC

Interconnects

@interconnectsai

12 Jun 2025

The rise of reasoning machines And a debate that doesn't warrant repeating. interconnects.ai/p/the-rise-…

The rise of reasoning machines

And a debate that doesn't warrant repeating.

interconnects.ai

13,443

Interconnects · Jul 14, 2023 · 8:52 PM UTC

Interconnects

@interconnectsai

14 Jul 2023

If you’re a student and want to read paid posts, contact @natolambert by email or DM. Happy to provide a base 80%+ discount.

23,446

Interconnects · Apr 18, 2024 · 9:19 PM UTC

Interconnects

@interconnectsai

18 Apr 2024

Llama 3: Scaling open LLMs to AGI Meta shows that scaling won't be a limit for open LLM players in the near future. interconnects.ai/p/llama-3-a…

Llama 3: Scaling open LLMs to AGI

Llama 3 shows that scaling won't be a limit for open LLM progress in the near future.

interconnects.ai

11,159

Interconnects · Mar 20, 2025 · 2:00 PM UTC

Interconnects

@interconnectsai

20 Mar 2025

The latest open artifacts (#8): The return of ~30B models, side effects of OpenAI's proposed DeepSeek ban, and yet another reasoning roundup Artifacts Log 8. Expect this pace to continue until mid summer. interconnects.ai/p/the-lates…

The latest open artifacts (#8): The return of the ~30B models, side effects of OpenAI's proposed...

Artifacts Log 8. Expect this pace to continue until mid summer.

interconnects.ai

4,323

Interconnects · Aug 8, 2024 · 1:06 PM UTC

Interconnects

@interconnectsai

8 Aug 2024

Interviewing Ross Taylor on LLM reasoning, Llama fine-tuning, Galactica, agents Interconnects interview #5. interconnects.ai/p/interview…

Interviewing Ross Taylor on LLM reasoning, Llama fine-tuning, Galactica, agents

Interconnects interview #5.

interconnects.ai

9,673

Interconnects · May 29, 2025 · 1:40 PM UTC

Interconnects

@interconnectsai

29 May 2025

Latest open artifacts (Artifacts Log #10): New DeepSeek R1 0528!, more permissive licenses, everything as a reasoner, and from artifacts to agents interconnects.ai/p/latest-op…

The latest open artifacts (#10): More permissive licenses, everything as a reasoner, and from...

Artifacts Log 10.

interconnects.ai

15,103

Interconnects · Feb 12, 2025 · 3:23 PM UTC

Interconnects

@interconnectsai

12 Feb 2025

Deep Research, information vs. insight, and the nature of science What AI will accelerate in the scientific process, what it cannot do, and how we can prepare for new manners of scientific investigation. interconnects.ai/p/deep-rese…

Deep Research, information vs. insight, and the nature of science

What AI will accelerate in the scientific process, what it cannot do, and how we can prepare for new manners of scientific investigation.

interconnects.ai

9,428

Interconnects · Oct 18, 2023 · 2:11 PM UTC

Interconnects

@interconnectsai

18 Oct 2023

Undoing RLHF and the brittleness of safe LLMs Recent papers show most of the arguments about needing "safety" in releases of open LLM weights are nearly dead in the water. Yes, still release the parameters. Read here: interconnects.ai/p/undoing-r…

Undoing RLHF and the brittleness of safe LLMs

Most of the arguments about "safe" releases of open LLM weights are nearly dead in the water.

interconnects.ai

11,973

Interconnects · Oct 16, 2024 · 3:37 PM UTC

Interconnects

@interconnectsai

16 Oct 2024

Building on evaluation quicksand On the state of evaluation for language models. interconnects.ai/p/building-…

Building on evaluation quicksand

On the state of evaluation for language models.

interconnects.ai

12,742

Interconnects · Nov 7, 2024 · 3:46 PM UTC

Interconnects

@interconnectsai

7 Nov 2024

Interviewing Tim Dettmers (@Tim_Dettmers) on open-source AI: Agents, scaling, quantization and what's next Interconnects interview #10. Catching up with one of the leaders of open-source AI. interconnects.ai/p/tim-dettm…

Interviewing Tim Dettmers on open-source AI: Agents, scaling, quantization and what's next

Listen now | Interconnects interview #10. Catching up with one of the leaders of open-source AI.

interconnects.ai

16,542

Interconnects · Sep 19, 2024 · 1:46 PM UTC

Interconnects

@interconnectsai

19 Sep 2024

Artifacts Log 4: Reflection 70B, o1 on LMSYS, fine-tuning fine-tunes, and speech models The latest open models and datasets. interconnects.ai/p/artifacts…

Artifacts Log 4: Reflection 70B, o1 on LMSYS, fine-tuning fine-tunes, and speech models

The latest open models and datasets.

interconnects.ai

1,677

Interconnects · Apr 15, 2024 · 4:07 PM UTC

Interconnects

@interconnectsai

15 Apr 2024

The end of the “best open LLM” Modeling the compute versus performance tradeoff of many open LLMs. interconnects.ai/p/compute-e…

The end of the “best open LLM”

Modeling the compute versus performance tradeoff of many open LLMs.

interconnects.ai

2,661

Interconnects · Aug 11, 2025 · 5:04 PM UTC

Interconnects

@interconnectsai

11 Aug 2025

Latest open artifacts (#13): The abundance era of open models Mostly thanks to Qwen, but now we're spoiled for choice and winds are shifting. interconnects.ai/p/latest-op…

Latest open artifacts (#13): The abundance era of open models

Mostly thanks to Qwen, but now we're spoiled for choice and winds are shifting.

interconnects.ai

13,258

Interconnects · May 8, 2024 · 8:50 PM UTC

Interconnects

@interconnectsai

8 May 2024

ChatBotArena: The peoples’ LLM evaluation, the future of evaluation, the incentives of evaluation, and gpt2chatbot What the details tell us about the most in-vogue LLM evaluation tool — and the rest of the field. interconnects.ai/p/chatbotar…

ChatBotArena: The peoples’ LLM evaluation, the future of evaluation, the incentives of evaluation,...

What the details tell us about the most in-vogue LLM evaluation tool — and the rest of the field.

interconnects.ai

3,837

Interconnects · May 10, 2024 · 9:07 PM UTC

Interconnects

@interconnectsai

10 May 2024

OpenAI’s Model (behavior) Spec, RLHF transparency, personalization questions Now we will have some grounding for when weird ChatGPT behaviors are intended or side-effects — shrinking the Overton window of RLHF bugs. interconnects.ai/p/openai-rl…

OpenAI’s Model (behavior) Spec, RLHF transparency, and personalization

Now we will have some grounding for when weird ChatGPT behaviors are intended or side-effects — shrinking the Overton window of RLHF bugs.

interconnects.ai

5,412

Interconnects · Jan 10, 2024 · 4:03 PM UTC

Interconnects

@interconnectsai

10 Jan 2024

Multimodal LM roundup: Unified IO 2, inputs and outputs, Gemini, LLaVA-RLHF, and RLHF questions A sampling of recent happenings in the multimodal space. Be sure to expect more this year. interconnects.ai/p/multimoda…

Multimodal LM roundup: Unified IO 2, inputs and outputs, Gemini, LLaVA-RLHF, and RLHF questions

A sampling of recent happenings in the multimodal space. Be sure to expect more this year.

interconnects.ai

7,122

Interconnects · Oct 30, 2024 · 3:05 PM UTC

Interconnects

@interconnectsai

30 Oct 2024

Why I build open language models Reflections after a year at the Allen Institute for AI and on the battlefields of open-source AI. interconnects.ai/p/why-i-bui…

Why I build open language models

Reflections after a year at the Allen Institute for AI and on the battlefields of open-source AI.

interconnects.ai

8,590

Interconnects · Mar 10, 2025 · 4:37 PM UTC

Interconnects

@interconnectsai

10 Mar 2025

Elicitation, the simplest way to understand post-training An F1 analogy to help understand fast improvements in post-training on top of slow improvements in scaling. buff.ly/XmEa3XN

13,666

Interconnects · Feb 18, 2025 · 8:30 PM UTC

Interconnects

@interconnectsai

18 Feb 2025

Grok 3 and an accelerating AI roadmap Where AI is heading, why 2024 felt slow, and shifting priorities of frontier laboratories. interconnects.ai/p/grok-3-an…

Grok 3 and an accelerating AI roadmap

Where AI is heading, why 2024 felt slow, and shifting priorities of frontier laboratories.

interconnects.ai

10,154

Interconnects · Feb 28, 2024 · 1:56 PM UTC

Interconnects

@interconnectsai

28 Feb 2024

How to cultivate a high-signal AI feed Basic tips on how to assess inbound ML content and cultivate your news feed. interconnects.ai/p/making-a-…

How to cultivate a high-signal AI feed

Basic tips on how to assess inbound ML content and cultivate your news feed.

interconnects.ai

11,759

Interconnects · May 21, 2025 · 1:36 PM UTC

Interconnects

@interconnectsai

21 May 2025

People use AI more than you think And businesses too. The most important trend in AI that gets washed away from between the headlines. interconnects.ai/p/people-us…

People use AI more than you think

And businesses too. The most important trend in AI that gets washed away from between the headlines.

interconnects.ai

7,805

Interconnects · Oct 26, 2023 · 3:20 PM UTC

Interconnects

@interconnectsai

26 Oct 2023

How the Foundation Model Transparency Index Distorts Transparency, by @natolambert, SE Gyges, @BlancheMinerva, and @aviskowron (cross-post with @AiEleuther) interconnects.ai/p/fmti-crit…

How the Foundation Model Transparency Index Distorts Transparency

A proper critique of the Foundation Model Transparency Index (FMTI). Plus some thoughts on the ecosystem implications.

interconnects.ai

1,498

Interconnects · Jun 21, 2025 · 3:22 PM UTC

Interconnects

@interconnectsai

21 Jun 2025

What I've been reading (#1) Splitting the links out from the artifacts log models & datasets series. interconnects.ai/p/what-ive-…

What I've been reading (#1)

Splitting the links out from the artifacts log models & datasets series.

interconnects.ai

11,389

Interconnects · May 6, 2025 · 1:42 PM UTC

Interconnects

@interconnectsai

6 May 2025

What people get wrong about the leading Chinese open models: Adoption and censorship Narrative violations on licenses, adoption, and censorship. interconnects.ai/p/what-peop…

What people get wrong about the leading Chinese open models: Adoption and censorship

Narrative violations on licenses, adoption, and censorship.

interconnects.ai

21,445

Interconnects · Apr 30, 2024 · 12:47 AM UTC

Interconnects

@interconnectsai

30 Apr 2024

Phi 3 and Arctic: Outlier LMs are hints Models that seem totally out of scope from recent open LLMs give us a sneak peek of where the industry will be in 6 to 18 months. interconnects.ai/p/phi-3-and…

Phi 3 and Arctic: Outlier LMs are hints

Models that seem totally out of scope from recent open LLMs give us a sneak peek of where the industry will be in 6 to 18 months.

interconnects.ai

9,411

Interconnects · Jul 3, 2024 · 3:54 PM UTC

Interconnects

@interconnectsai

3 Jul 2024

Switched to Claude 3.5 Speculations on the role of RLHF and why I love the model for people who pay attention. interconnects.ai/p/switched-…

Switched to Claude 3.5

Speculations on the role of RLHF and why I love the model for people who pay attention.

interconnects.ai

8,707

Interconnects · Mar 13, 2024 · 3:09 PM UTC

Interconnects

@interconnectsai

13 Mar 2024

Model commoditization and product moats Where moats are tested now that so many people have trained GPT4 class models. Claude 3, Gemini 1.5, Inflection 2.5, and Mistral Large are here to party. interconnects.ai/p/gpt4-comm…

Model commoditization and product moats

Where moats are tested now that so many people have trained GPT4 class models. Claude 3, Gemini 1.5, Inflection 2.5, and Mistral Large are here to party.

interconnects.ai

14,213

Interconnects · Apr 21, 2025 · 4:41 PM UTC

Interconnects

@interconnectsai

21 Apr 2025

The latest open artifacts (#9): RLHF book draft, where the open reasoning race is going, and unsung heroes of open LM work Artifacts Log 9. interconnects.ai/p/the-lates…

The latest open artifacts (#9): RLHF book draft, where the open reasoning race is going, and unsung...

Artifacts Log 9.

interconnects.ai

29,535

Interconnects · Mar 20, 2024 · 6:40 PM UTC

Interconnects

@interconnectsai

20 Mar 2024

Evaluations: Trust, performance, and price (bonus, announcing RewardBench) Evaluation is not only getting harder with modern LLMs getting more complicated, it’s getting harder because it means something different. interconnects.ai/p/evaluatio…

Evaluations: Trust, performance, and price (bonus, announcing RewardBench)

Evaluation is not only getting harder with modern LLMs, it’s getting harder because it means something different.

interconnects.ai

5,805

Interconnects · Dec 6, 2023 · 3:56 PM UTC

Interconnects

@interconnectsai

6 Dec 2023

The DPO debate: Do we need RL for RLHF? Direct vs. RL methods for preferences, more RLHF models, and hard truths in open RLHF work. We have more questions than answers. interconnects.ai/p/the-dpo-d…

Do we need RL for RLHF?

Direct (DPO) vs. RL methods for preferences, more RLHF models, and hard truths in open RLHF work. We have more questions than answers.

interconnects.ai

7,151

Interconnects · May 29, 2024 · 3:09 PM UTC

Interconnects

@interconnectsai

29 May 2024

We aren’t running out of training data, we are running out of open training data Data licensing deals, scaling, human inputs, and repeating trends in open vs. closed LLMs. interconnects.ai/p/the-data-…

We aren’t running out of training data, we are running out of open training data

Data licensing deals, scaling, human inputs, and repeating trends in open vs. closed LLMs.

interconnects.ai

6,899

Interconnects · Jun 12, 2024 · 2:42 PM UTC

Interconnects

@interconnectsai

12 Jun 2024

AI for the rest of us Apple Intelligence makes a lot of sense when you get out of the AI bubble. Plus, the cool technical details Apple shared about their language models "thinking different." interconnects.ai/p/apple-int…

AI for the rest of us

Apple Intelligence makes a lot of sense when you get out of the AI bubble. Plus, the cool technical details Apple shared about their language models "thinking different."

interconnects.ai

13,976

Interconnects · Apr 14, 2025 · 7:30 PM UTC

Interconnects

@interconnectsai

14 Apr 2025

OpenAI's GPT-4.1 and separating the API from ChatGPT OpenAI's latest models optimizing on intelligence per dollar. We'll continue to see ChatGPT handled differently than the API business. interconnects.ai/p/openais-g…

OpenAI's GPT-4.1 and separating the API from ChatGPT

OpenAI's latest models optimizing on intelligence per dollar. We'll continue to see ChatGPT handled differently than the API business.

interconnects.ai

3,679

Interconnects · Apr 30, 2025 · 1:38 PM UTC

Interconnects

@interconnectsai

30 Apr 2025

Brakes on an intelligence explosion Why I don't think AI 2027 is going to come true. interconnects.ai/p/brakes-on…

State of play of AI progress (and related brakes on an intelligence explosion)

Why I don't think AI 2027 is going to come true.

interconnects.ai

14,298

Interconnects · Jan 27, 2025 · 3:38 PM UTC

Interconnects

@interconnectsai

27 Jan 2025

The latest open artifacts (#6): Reasoning models, China's lead in open-source, and a growing multimodal space Artifacts Log 6. The open LM ecosystem yet again accelerates. interconnects.ai/p/open-arti…

The latest open artifacts (#6): Reasoning models, China's lead in open-source, and a growing...

Artifacts Log 6. The open LM ecosystem yet again accelerates.

interconnects.ai

7,585

Interconnects · Oct 9, 2024 · 3:16 PM UTC

Interconnects

@interconnectsai

9 Oct 2024

How scaling changes model behavior Some trends are reasonable to extrapolate, some are not. Even for the trends we are succeeding at extrapolating, it is not clear how that signal translates into different AI behaviors. interconnects.ai/p/how-scali…

How scaling changes model behavior

Some trends are reasonable to extrapolate, some are not. Even for the trends we are succeeding at extrapolating, it is not clear how that signal translates into different AI behaviors.

interconnects.ai

15,901

Interconnects · Feb 5, 2025 · 3:29 PM UTC

Interconnects

@interconnectsai

5 Feb 2025

Making the U.S. the home for open-source AI Open-source AI is here to stay, but it is not a given that it will be American. interconnects.ai/p/making-th…

Making the U.S. the home for open-source AI

Open-source AI is here to stay, but it is not a given that it will be American.

interconnects.ai

11,251

Interconnects · Dec 11, 2023 · 8:15 PM UTC

Interconnects

@interconnectsai

11 Dec 2023

Mixtral Round-up: MoE trade-offs, release lessons, Mistral raises $400mil, Google's loss, vibes vs marketing Emergency blog 🚨 We have an amazing open mixture of experts model for the holidays! interconnects.ai/p/mixtral

Mixtral: The best open model, MoE trade-offs, release lessons, Mistral raises $400mil, Google's...

We have an amazing open mixture of experts model for the holidays!

interconnects.ai

10,511

Interconnects · Apr 17, 2024 · 6:19 PM UTC

Interconnects

@interconnectsai

17 Apr 2024

We don’t need to reinvent everything to solve alignment Integrating some non-computing science into reinforcement learning from human feedback (RLHF) can give us the models we want. Bonus: OLMo 1.7-7B. interconnects.ai/p/reinventi…

Stop "reinventing" everything to solve alignment

Integrating some non-computing science into reinforcement learning from human feedback (RLHF) can give us the models we want.

interconnects.ai

5,363

Interconnects · Jan 3, 2024 · 3:47 PM UTC

Interconnects

@interconnectsai

3 Jan 2024

It's 2024 and they just want to learn The state of the ML communities big and small starting 2024. My general expectations for the year. interconnects.ai/p/they-want…

It's 2024 and they just want to learn

The state of the ML communities big and small starting 2024. My general expectations for the year.

interconnects.ai

9,668

Interconnects · Jun 26, 2025 · 2:24 PM UTC

Interconnects

@interconnectsai

26 Jun 2025

Latest open artifacts (#11): Visualizing China's open models market share, Arcee's models, and VLAs for robotics Artifacts Log 11. interconnects.ai/p/latest-op…

Latest open artifacts (#11): Visualizing China's open models market share, Arcee's models, and VLAs...

Artifacts Log 11.

interconnects.ai

11,423

Interconnects · Aug 1, 2024 · 3:05 PM UTC

Interconnects

@interconnectsai

1 Aug 2024

Interviewing Sebastian Raschka on the state of open LLMs, Llama 3.1, and AI education Interconnects interviews #4. interconnects.ai/p/interview…

Interviewing Sebastian Raschka on the state of open LLMs, Llama 3.1, and AI education

Interconnects interviews #4.

interconnects.ai

19,458

Interconnects · Dec 5, 2024 · 3:37 PM UTC

Interconnects

@interconnectsai

5 Dec 2024

Interviewing Finbarr Timbers on the "We are So Back" Era of Reinforcement Learning Interconnects interview #11. An overview on the past, present, and future of RL. interconnects.ai/p/finbarr-t…

Interviewing Finbarr Timbers on the "We are So Back" Era of Reinforcement Learning

Interconnects interview #11. An overview on the past, present, and future of RL.

interconnects.ai

5,733

Interconnects · Aug 11, 2025 · 6:17 PM UTC

Interconnects

@interconnectsai

11 Aug 2025

This month's leading open model contributors in Artifacts Log. Thanks for continuing to release your work.
Qwen (@Alibaba_Qwen) x5
Zhipu AI (@ZhipuAI) x2
NVIDIA (@nvidia) x2
OpenAI (@OpenAI)
InclusionAI (@InclusionAI666) x2
Infinigence (@infinigenceAI)
Tesslate (@TesslateAI)
Arcee (@arcee_ai)
DeepCogito (@DeepCogito)
Kakao (@kakaocorpglobal)
Skywork (@Skywork_ai)
Tencent (@TencentGlobal) x2
InternLM (@intern_lm)
StepFun (@StepFun_ai)
SK Telecom (@SKtelecom)
Xiaomi MiMo (@Xiaomi)
OpenBMB (@OpenBMB)
Quotient AI (@QuotientAI)
Roblox (@Roblox)
Knowledgator (@knowledgator)
Cisco Foundation AI (@fdtn_ai)
ByteDance Seed (@bytedance_talk)
JHU CLSP (@jhuclsp)
VAGO Solutions (@VAGOsolutions)
NuMind (@numind_ai)
Neta (@NetaArt_AI)
Black Forest Labs (@bfl_ml)
Numina (@ProjectNumina)
Krea AI (@krea_ai)
Hugging Face M4 (@huggingface)
moondream (@moondreamai)
SpatialVerse (@spatialverse)
RedNote HiLab x2 KwaiPilot PowerInfer MetaStoneTec Trillion Labs IBM Granite ScienceOne AI kpsss34 X-Omni Tencent BAC MiSpeech Wan AI The Common Pile

Interconnects

@interconnectsai

11 Aug 2025

Latest open artifacts (#13): The abundance era of open models Mostly thanks to Qwen, but now we're spoiled for choice and winds are shifting. interconnects.ai/p/latest-op…

1,673

Interconnects · Oct 11, 2023 · 2:39 PM UTC

Interconnects

@interconnectsai

11 Oct 2023

The AI research job market shit show (and my experience) There are plenty of jobs, but finding a place where you're happy is as hard as ever. Read here: interconnects.ai/p/ai-resear…

The AI research job market shit show (and my experience)

There are plenty of jobs, but finding a place where you're happy is as hard as ever.

interconnects.ai

1,576

Interconnects · Apr 28, 2025 · 2:12 PM UTC

Interconnects

@interconnectsai

28 Apr 2025

Transparency and (shifting) priority stacks What you want to be open says a lot about your ranked priorities. interconnects.ai/p/transpare…

Transparency and (shifting) priority stacks

What you want to be open says a lot about your ranked priorities.

interconnects.ai

11,087

Interconnects · Jul 17, 2024 · 2:18 PM UTC

Interconnects

@interconnectsai

17 Jul 2024

SB 1047, AI regulation, and unlikely allies for open models The rallying of the open-source community against CA SB 1047 can represent a turning point for AI regulation. interconnects.ai/p/sb-1047-a…

SB 1047, AI regulation, and unlikely allies for open models

The rallying of the open-source community against CA SB 1047 can represent a turning point for AI regulation.

interconnects.ai

2,960

Interconnects · Aug 2, 2023 · 3:04 PM UTC

Interconnects

@interconnectsai

2 Aug 2023

Specifying objectives in RLHF: the links between the scientific weirdness of RLHF, DPO, and @johnschulman2's ICML talk on proxy objectives. interconnects.ai/p/specifyin…

Specifying objectives in RLHF

At ICML, it is obvious that many people are getting value out of RLHF. What is limiting the scientific understanding of it (other than research embargoes)?

interconnects.ai

1,728