Teknium 🪽 · Feb 25, 2026 · 8:46 PM UTC

Teknium 🪽

@Teknium

Feb 25

We just released Hermes Agent! In my humble opinion a very good blend between coding agents like Claude Code and generalist agents like Clawdbot. Been working on this for the last month or so now - started as a way for us to have agentic primitives for datagen and RL and got inspired by the agentic revolution of late, so been expanding its scope and capabilities non-stop! Hope you all enjoy.

Nous Research

@NousResearch

Feb 25

Meet Hermes Agent, the open source agent that grows with you. Hermes Agent remembers what it learns and gets more capable over time, with a multi-level memory system and persistent dedicated machine access.

162

119

1,846

590,587

Teknium 🪽 · Jan 30, 2025 · 9:27 PM UTC

Teknium 🪽

@Teknium

30 Jan 2025

This is the entire code needed to reproduce R1 lol Hundreds of Billions of Dollars Later

397

1,535

17,736

2,344,362

Teknium 🪽 · Jan 4, 2025 · 8:19 AM UTC

Teknium 🪽

@Teknium

4 Jan 2025

lol this exchange was funny to me

449

15,995

427,779

Teknium 🪽 · Jan 26, 2025 · 3:27 PM UTC

Teknium 🪽

@Teknium

26 Jan 2025

Its crazy deepseek direct api has seemingly no rate limits of any kind

144

257

7,253

767,472

Teknium 🪽 · Oct 21, 2024 · 12:30 PM UTC

Teknium 🪽

@Teknium

21 Oct 2024

Replying to @satyanadella

Your head of ai said that agents should be illegal and that the worst part about ai is it gives power to the average person instead of to governments and elites. And you expect us to think your product is going to be good?

1,510

131,440

Teknium 🪽 · Jan 24, 2025 · 8:38 PM UTC

Teknium 🪽

@Teknium

24 Jan 2025

Replying to @nealkhosla

Worst take I've literally ever seen. Noooo dont use the clearly as good if not better open model that you can run on prem, see the cot, and run cheaper!! no!!! You must PAY THE MONOPOLY THEIR DUES!!!! Must Protecc Monopoly 🤖🤖🤖

106

7,062

120,405

Teknium 🪽 · Aug 7, 2025 · 5:32 PM UTC

Teknium 🪽

@Teknium

7 Aug 2025

Amazing..

Teknium 🪽

@Teknium

7 Aug 2025

Umm what is this new chart crime?

120

161

5,855

590,884

Teknium 🪽 · Sep 8, 2025 · 2:11 AM UTC

Teknium 🪽

@Teknium

8 Sep 2025

New challenge now that models are overfit on the original lol

sid

@immasiddx

6 Sep 2025

Don’t worry, our jobs are safe.

108

5,197

275,881

Teknium 🪽 · Oct 6, 2025 · 12:29 PM UTC

Teknium 🪽

@Teknium

6 Oct 2025

You can now invest in Nvidia, Intel, AMD, ARM, OpenAI, Mistral, CoreWeave, Nebius, and more with just one ticker: NVDA lol

226

5,096

274,531

Teknium 🪽 · Aug 9, 2025 · 1:37 PM UTC

Teknium 🪽

@Teknium

9 Aug 2025

OpenAI has done some real damage

Deedy

@deedydas

9 Aug 2025

This is the #1 post in r/OpenAI today.

134

129

4,668

2,059,971

Teknium 🪽 · Nov 9, 2024 · 11:31 PM UTC

Teknium 🪽

@Teknium

9 Nov 2024

Replying to @Jason

Why does it cost 3x more than the average person's income to house someone in a prison for a year have you asked yourself this

4,280

132,259

Teknium 🪽 · Aug 7, 2025 · 5:06 PM UTC

Teknium 🪽

@Teknium

7 Aug 2025

Umm what is this new chart crime?

162

200

4,420

3,378,982

Teknium 🪽 · Apr 19, 2024 · 2:26 PM UTC

Teknium 🪽

@Teknium

19 Apr 2024

Welp folks, we have gpt-4 at home

143

335

4,169

764,540

Teknium 🪽 · Feb 18, 2025 · 5:06 AM UTC

Teknium 🪽

@Teknium

18 Feb 2025

In my testing it was at least as good in thinking mode as o3-full deep research was, despite that not being listed here - Interesting to note that grok-3mini seems generally better than full, my guess is that this means they didnt distill full into mini like I assume OpenAI did, instead they seem to have full RL'ed the mini one too

322

435

3,599

1,706,195

Teknium 🪽 · Mar 24, 2025 · 12:23 PM UTC

Teknium 🪽

@Teknium

24 Mar 2025

A wild Deepseek has appeared huggingface.co/deepseek-ai/D…

deepseek-ai/DeepSeek-V3-0324 at main

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co

308

3,480

384,914

Teknium 🪽 · Jan 24, 2025 · 6:53 PM UTC

Teknium 🪽

@Teknium

24 Jan 2025

Unbelievable the amount of cope, seethe, and hoop jumping people are doing to discredit deepseeks accomplishments lol

111

198

3,093

131,274

Teknium 🪽 · Jan 30, 2025 · 8:48 AM UTC

Teknium 🪽

@Teknium

30 Jan 2025

lol Sometimes I think that @Microsoft secretly wants OpenAI to die - they are literally serving R1 for FREE xD

142

3,157

223,013

Teknium 🪽 · May 28, 2025 · 5:47 PM UTC

Teknium 🪽

@Teknium

28 May 2025

Deepseek r1-v2 is out 👀 huggingface.co/deepseek-ai/D…

deepseek-ai/DeepSeek-R1-0528 · Hugging Face

340

3,150

301,717

Teknium 🪽 · Jan 20, 2025 · 12:31 PM UTC

Teknium 🪽

@Teknium

20 Jan 2025

OpenAI seething so hard I cant even paste r1's paper into o1 without a content violation what losers

140

2,972

190,284

Teknium 🪽 · Jul 12, 2025 · 4:01 AM UTC

Teknium 🪽

@Teknium

12 Jul 2025

Literally never trust this guy - as far as i can tell, all the big providers already have the model IN HAND, and were ready to start serving it. A 1T parameter model that matches opus at coding was opensourced today and the world hasnt ended, he does nothing but lie. Also Elon you aint much better on this front, release grok3 now like you promised you would too. Why are American companies so bad at keeping promises, and so good at hyping things that arent to be? Meanwhile the chinese labs just plop the largest most powerful models right on our doorstep without a second thought and push forward opensource for everyone practically weekly now. American big labs, do better or stfu and stop making promises you wont keep.

Sam Altman

@sama

12 Jul 2025

we planned to launch our open-weight model next week. we are delaying it; we need time to run additional safety tests and review high-risk areas. we are not yet sure how long it will take us. while we trust the community will build great things with this model, once weights are out, they can’t be pulled back. this is new for us and we want to get it right. sorry to be the bearer of bad news; we are working super hard!

127

165

3,059

451,591

Teknium 🪽 · Sep 26, 2024 · 3:01 PM UTC

Teknium 🪽

@Teknium

26 Sep 2024

Yann constantly says Europe is more free than America then puts out a model that Europeans aren't allowed to download because they aren't free enough to do so lol

123

2,832

238,070

Teknium 🪽 · Mar 29, 2025 · 11:20 PM UTC

Teknium 🪽

@Teknium

29 Mar 2025

4o Imagen can do calculations *during* its image generation somehow

2,549

295,854

Teknium 🪽 · Oct 6, 2025 · 11:45 AM UTC

Teknium 🪽

@Teknium

6 Oct 2025

I’m confused - OpenAI is buying chips so amd gave them 10% of their company?

Andrew Curran

@AndrewCurran_

6 Oct 2025

OpenAI and AMD have reached a 6 gigawatt agreement through which AMD has issued OpenAI a warrant for up to 160 million shares of AMD common stock - vesting on deployment and AMD share price - that could ultimately result in OAI taking a 10% stake in AMD. AMD is up 25% pre-market.

102

2,377

426,320

Teknium 🪽 · Aug 7, 2025 · 5:42 PM UTC

Teknium 🪽

@Teknium

7 Aug 2025

Damn lol

2,180

185,148

Teknium 🪽 · Jan 20, 2025 · 12:24 PM UTC

Teknium 🪽

@Teknium

20 Jan 2025

Destroys 4o; we have frontier models @ home I guess now

123

2,002

164,801

Teknium 🪽 · Dec 22, 2024 · 12:14 PM UTC

Teknium 🪽

@Teknium

22 Dec 2024

I havent seen any Sora videos on twitter since launch day..

136

1,896

234,388

Teknium 🪽 · Jan 31, 2024 · 9:00 PM UTC

Teknium 🪽

@Teknium

31 Jan 2024

Today I have a huge announcement. The dataset used to create Open Hermes 2.5 and Nous-Hermes 2 is now PUBLIC! Available Here: huggingface.co/datasets/tekn… This dataset was the culmination of all my work on curating, filtering, and generating datasets, with over 1M Examples from datasets from across the open source ecosystem and some that I generated as well. Super excited to be able to finally share this with you all and even more excited to see what you all make from its release! Every data source (except one that I can no longer find) is attributed in the data card, but I will post it here in this thread as well - If they have a twitter account they'll be tagged so be sure to follow them all!

100

285

1,901

237,068

Teknium 🪽 · May 1, 2025 · 2:38 AM UTC

Teknium 🪽

@Teknium

1 May 2025

They couldnt just.. give it to everyone right now? Lol

Sam Altman

@sama

1 May 2025

goodbye, GPT-4. you kicked off a revolution. we will proudly keep your weights on a special hard drive to give to some historians in the future.

1,866

130,217

Teknium 🪽 · Jan 28, 2025 · 6:40 AM UTC

Teknium 🪽

@Teknium

28 Jan 2025

Replying to @sama

But gpt-3 weights still too dangerous to open source though right

1,838

61,596

Teknium 🪽 · Jan 1, 2024 · 1:50 AM UTC

Teknium 🪽

@Teknium

1 Jan 2024

Happy New Years Everybody! 🥳 One year ago today, I had: - Never trained any model - Did not know the first thing about AI - Never worked in Tech - Had 8 followers on twitter? (probably) One year later and here I am! Met many of my heroes and legends in tech and worked with hundreds of amazingly talented and influential people, went on some amazing trips to SF and NeurIPS and met so many of you, worked at three tech co's, built Nous Research with @theemozilla @karan4d and Shivani, made several of the worlds SOTA OS models, became a pro data engineer, got cited in at least 14 papers, and so much more! Imagine where we'll be in a year from now, I'm optimistic I'll be able to continue contributing to the Open Source future!

198

104

2,087

365,570

Teknium 🪽 · Feb 27, 2025 · 8:09 AM UTC

Teknium 🪽

@Teknium

27 Feb 2025

I really need an LLM that reads in my actual whole codebase and lets me QA it. Cursor afaict doesn't do this. What does?

418

1,777

511,544

Teknium 🪽 · Jan 24, 2025 · 10:06 PM UTC

Teknium 🪽

@Teknium

24 Jan 2025

Replying to @nealkhosla

What does not taking the bait mean? Dont use their model and instead pay for a closed, no cot, no interpretability, more expensive, even worse censored, model that openai provides?

1,663

19,886

Teknium 🪽 · Aug 5, 2025 · 10:53 PM UTC

Teknium 🪽

@Teknium

5 Aug 2025

Starting to feel like this gpt oss was trained on like 20T tokens of distilled safe maybe even benchmaxxed data from o3. There seems to be no base model underneath.. Is this phi 5 maxx?

1,759

318,517

Teknium 🪽 · May 3, 2025 · 8:01 AM UTC

Teknium 🪽

@Teknium

3 May 2025

I think web search is easy but how are they getting past all the captchas and such?

Anthropic

@AnthropicAI

2 May 2025

Web search is available worldwide for all paid plans. For everyday tasks, Claude runs quick searches. For more complex questions, it explores multiple sources, including Google Workspace.

1,708

672,951

Teknium 🪽 · Feb 4, 2025 · 7:41 AM UTC

Teknium 🪽

@Teknium

4 Feb 2025

Really crazy no ones talking about KimiK paper - Its even deeper and more insightful than r1 on the RL measures they went through and produced what they claim is an o1 level multimodal model. github.com/MoonshotAI/Kimi-k…

Kimi-k1.5/Kimi_k1.5.pdf at main · MoonshotAI/Kimi-k1.5

Contribute to MoonshotAI/Kimi-k1.5 development by creating an account on GitHub.

github.com

163

1,614

131,724

Teknium 🪽 · Nov 22, 2023 · 8:04 AM UTC

Teknium 🪽

@Teknium

22 Nov 2023

104

1,546

88,877

Teknium 🪽 · Jun 30, 2025 · 2:55 AM UTC

Teknium 🪽

@Teknium

30 Jun 2025

tfw you know a paper's going to be good

125

1,533

135,565

Teknium 🪽 · Aug 15, 2023 · 8:34 AM UTC

Teknium 🪽

@Teknium

15 Aug 2023

Why do almost no papers release code, datasets, info on replication, final models, or any combination of these? I thought for science to work results had to be reproducible and verified. Really not scientific and I don't know why academia accepts this

130

111

1,472

251,034

Teknium 🪽 · Jan 30, 2025 · 8:51 PM UTC

Teknium 🪽

@Teknium

30 Jan 2025

This is beyond dumb and the absolute embarrassment the people at openai and khosla should feel for even trying to be whiny brats about their IP is the most contradictory and hypocritical shit i think ive ever heard from their camp. It just says they are worried and literally panicking like they never have before

1,455

89,525

Teknium 🪽 · Jan 15, 2024 · 8:18 PM UTC

Teknium 🪽

@Teknium

15 Jan 2024

It's finally time! Our Mixtral 8x7B model is up and available now! Nous-Hermes-2 Mixtral 8x7B comes in two variants, an SFT+DPO and SFT-Only, so you can try and see which works best for you! It's afaik the first Mixtral based model to beat @MistralAI's Mixtral Instruct model, and in my own personal testing, is potentially the best Open Source LLM available!

Nous Research

@NousResearch

15 Jan 2024

Introducing our new flagship LLM, Nous-Hermes 2 on Mixtral 8x7B. Our first model that was trained with RLHF, and the first model to beat Mixtral Instruct in the bulk of popular benchmarks! We are releasing the SFT only and SFT+DPO model, as well as a qlora adapter for the DPO today. Mixtral Nous-Hermes 2 DPO: huggingface.co/NousResearch/… Mixtral Nous-Hermes 2 SFT: huggingface.co/NousResearch/… Mixtral Nous-Hermes 2 DPO Adapter: huggingface.co/NousResearch/… (1/2)

205

1,477

325,035

Teknium 🪽 · Jun 8, 2025 · 10:11 AM UTC

Teknium 🪽

@Teknium

8 Jun 2025

Its funny that Anthropic and Google are the only competitors in coding AI that matter right now openai has just become a gen-media company and I only see people meaningfully using it for imagen and voice mode entertainment

150

1,487

146,611

Teknium 🪽 · Aug 6, 2025 · 12:01 PM UTC

Teknium 🪽

@Teknium

6 Aug 2025

This is what happens when you benchmax ngl

Sauers

@Sauers_

5 Aug 2025

GPT OSS 120b likes to insert equations into poetry (replicated 3x)

1,419

76,094

Teknium 🪽 · Oct 29, 2025 · 5:28 AM UTC

Teknium 🪽

@Teknium

29 Oct 2025

why all oai peeps acting like this is some traumatizing hardship openai went through and survived lol

Brad Lightcap

@bradlightcap

28 Oct 2025

🤍

1,412

197,234

Teknium 🪽 · Feb 1, 2025 · 12:28 AM UTC

Teknium 🪽

@Teknium

1 Feb 2025

Let me tell you all a little story. Sometime around a year or so ago i reached out to an openai staffer who will not be named who had implied they would be very interested in doing some things open source. So, as any good representative of open source, i went to all the people i knew working in open source, and asked them all what things could openai do opensource that they would like to see. I collected maybe 15 bullet points, and gave that list to the staffer.

6 months goes by with no response. So i say whats up lets go? And nothing. So i emailed sama himself. And ya know what’s even? He responded! And setup a meeting with a high level research director. We had an hour long conversation where i went through every item on the list, and he mostly shot every single one down. Mostly for not aligning with their business goals, some with hmm we will consider its. Said if they come up with any ideas they’d reach back out.

This was like 6 months ago now, and ive never heard back. Since then, i simply gave up trying with them. I gave them all the reasons i could as to why they would benefit from it, but it fell on what at least at the time were deaf ears.

Here is that list again @sama since you seem to have finally seen what i was thinking, thanks at least for answering the email.

- Open source your models or at least legacy and deprecated models
- Research papers on what has worked for OAI, GPT-4 information such as architecture etc
- Opensource datasets for IFT/RLHF/Toolformers/Plugins/FunctionCalling/Web Browsing? (Addendum, reasoning)
- Provide preferential access channels for vetted OSAI groups.
- Vet internal research for release, with prerogative to release as much as possible, even if on a delay commensurate with business goals and safety concerns.
- Have official townhall type meetings with major open source projects/teams
- Tools to filter data
- Contribute directly to public open source AI projects.
- Plug & Play RLHF training code/Reward Models
- Alternatives to Text Models (Voice/Music/Image/Video) (preferably OS)
- Open sourced moderation classifier model/Classification Model's dataset
- Base gpt-4 model access for researchers
- opt in to removing the gpt4 moderation
- Release roadmap
- Opening up the ToS to make it unambiguous that training models on openai outputs are all good.

Tsarathustra

@tsarnick

31 Jan 2025

Sam Altman: "we have been on the wrong side of history" with regards to open source/open weights AI models

154

1,347

166,432

Teknium 🪽 · Jan 27, 2025 · 2:52 AM UTC

Teknium 🪽

@Teknium

27 Jan 2025

Replying to @stevenheidel

Way to take the high road, as usual, OpenAI

1,261

62,196

Teknium 🪽 · Feb 18, 2025 · 1:59 AM UTC

Teknium 🪽

@Teknium

18 Feb 2025

Please god vote for o3-mini lmao what ya'll doin we can distill it into a phone sized model people!!

Sam Altman

@sama

18 Feb 2025

for our next open source project, would it be more useful to do an o3-mini level model that is pretty small but still needs to run on GPUs, or the best phone-sized model we can do?

1,330

118,922

Teknium 🪽 · Jan 24, 2025 · 8:50 PM UTC

Teknium 🪽

@Teknium

24 Jan 2025

We retrained hermes with 5k deepseek r1 distilled cots. I can confirm a few things:

1. You can have a generalist + reasoning mode, we labeled all longCoT samples from r1 with a static system prompt, the model when not using it does normal fast LLM intuitive responses, and with, uses LongCot - You do not need "O1 && 4o" separation for instance, I would venture to bet OpenAI separated them so they can charge more, but maybe just wanted the distinction for safety or product insights.

2. Distilling does seem to pick up the "opcodes" of reasoning from the SFT alone. It learns how and when to use "Wait" and other tokens to perform the functions of reasoning, such as backtracking.

3. Context length expansion is going to be hard for OS to work with. Even though this stuff works well on smaller models, context length starts to eat a lot of vram as you scale that up.

We're working on a bit more of this and are not releasing this model but figured I'd share some early insights

127

1,318

105,226
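
A minimal sketch of the labeling scheme described in point 1 of the tweet above, assuming a chat-style SFT format. The prompt text, function name, and message layout are hypothetical illustrations, not the actual Hermes pipeline: long-CoT samples get one static system prompt, and ordinary samples get none, so the system prompt acts as the switch for reasoning mode.

```python
# Hypothetical prompt text; the tweet does not give the actual system prompt.
REASONING_SYSTEM_PROMPT = "Think through the problem step by step before answering."


def build_sft_example(question, answer, long_cot=None):
    """Build one chat-format training example as a list of messages.

    If a long chain of thought is supplied, the static system prompt is
    prepended and the reasoning is placed before the final answer.
    Otherwise the example has no system prompt and a direct answer.
    """
    messages = []
    if long_cot is not None:
        messages.append({"role": "system", "content": REASONING_SYSTEM_PROMPT})
        target = f"{long_cot}\n\n{answer}"
    else:
        target = answer
    messages.append({"role": "user", "content": question})
    messages.append({"role": "assistant", "content": target})
    return messages


fast = build_sft_example("What is 2 + 2?", "4")
slow = build_sft_example("What is 2 + 2?", "4", long_cot="Wait, let me add: 2 + 2 = 4.")
```

At inference time the same switch would apply under this assumption: include the system prompt to request long-CoT behavior, omit it for a fast direct response.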

Teknium 🪽 · Aug 7, 2025 · 7:08 PM UTC

Teknium 🪽

@Teknium

7 Aug 2025

Wrong

Theo - t3.gg

@theo

7 Aug 2025

Things that stopped being relevant today: - Claude 4 Sonnet - Claude 4.1 Opus - Claude Code - o3-pro - Gemini 2.5 Pro - Gemini 2.0 Flash (2.5 was rough) - Every mini and nano model ever made - More stuff I'm not thinking of

1,195

80,966

Teknium 🪽 · Feb 2, 2025 · 8:46 PM UTC

Teknium 🪽

@Teknium

2 Feb 2025

They used "Deep" because of deepseek i bet

OpenAI

@OpenAI

2 Feb 2025

Deep Research Live from Tokyo 4pm PT / 9am JST Stay tuned for link to livestream.

117

1,141

92,103

Teknium 🪽 · May 22, 2025 · 8:52 PM UTC

Teknium 🪽

@Teknium

22 May 2025

I think what needs to be stated about the Claude 4 narc drama is that this is not just an emergent thing because the model's smart, but a direct result of the obsessive and intentional alignment, safety, ethical, and moral brainwashing that anthropic does to their models led to its rational endpoint - and will only get worse

1,197

163,284

Teknium 🪽 · Dec 21, 2024 · 6:46 PM UTC

Teknium 🪽

@Teknium

21 Dec 2024

So is anthropic going to answer openai

1,127

126,939

Teknium 🪽 · Mar 5, 2025 · 5:19 AM UTC

Teknium 🪽

@Teknium

5 Mar 2025

My feeds become politics twitter is ruined bring back the ai paper posts

114

1,134

47,104

Teknium 🪽 · Apr 19, 2024 · 3:35 PM UTC

Teknium 🪽

@Teknium

19 Apr 2024

I think Meta and Llama-3 is the final nail in the coffin to several misconceptions I've been fighting against for the last year. Llama-3 Chat was trained on over 10M Instruction/Chat samples, and is one of the only finetunes that shows significant improvements to MMLU. Contradicting several claims: - That finetuning can't teach a model new knowledge, MMLU is a wide variety huge dataset of knowledge QA, improvement of over 3pts in MMLU strongly shows that it does indeed add new knowledge - That LIMA paper (ironically by Meta) claim, that "10k" samples is best you can do to teach a model to do things with, completely destroyed by this. I've been arguing these things to people for a year, with a lot of pushback, but the evidence is clear, though I'd argue it was clear long prior from Hermes work.

1,111

284,577

Teknium 🪽 · Mar 23, 2025 · 4:55 AM UTC

Teknium 🪽

@Teknium

23 Mar 2025

Okay so I finally got full @cursor_ai and the agent mode is like Devin but actually works. Crazy that Devin charges $500 a month and this costs 20.. lol

1,130

147,815

Teknium 🪽 · Jul 21, 2024 · 11:31 AM UTC

Teknium 🪽

@Teknium

21 Jul 2024

Synthetic data model beats its teacher? Who couldve imagined 🤓

Philipp Schmid

@_philschmid

21 Jul 2024

Synthetic data can beat its teacher! The AI-MO team released their winning dataset with an additional fine-tuned @Alibaba_Qwen 2 model that approaches or surpasses @OpenAI GPT-4o and @AnthropicAI Claude 3.5 in math competitions. 👀 There was a sentiment that fine-tuned models from synthetic datasets could not beat their teachers. Well, they can! NuminaMath 72B TIR matches GPT-4o and Anthropic Claude 3.5 on AMC 2023 and AIME 2024 with TIR. Open LLMs + Synthetic Data = 🚀

245

35,081

Teknium 🪽 · Feb 18, 2025 · 4:25 AM UTC

Teknium 🪽

@Teknium

18 Feb 2025

Grok3 Unveiled

1,076

157,477

Teknium 🪽 · Feb 18, 2025 · 4:50 PM UTC

Teknium 🪽

@Teknium

18 Feb 2025

OpenAI is so confusing

1,077

117,524

Teknium 🪽 · Jan 10, 2024 · 12:01 AM UTC

Teknium 🪽

@Teknium

10 Jan 2024

Nous has completed its raise, we're a company now ^_^

Nous Research

@NousResearch

9 Jan 2024

Nous Research is excited to announce the closing of our $5.2 million seed financing round. We're proud to work with passionate, high-integrity partners that made this round possible, including co-leads @DistributedG and @OSSCapital, with participation from @vipulved, founder and CEO at Together AI, Yonatan Ben Shimon, founder at Matchbox DAO, and several angel investors including @balajis, entrepreneur and investor, @thibaudz, entrepreneur and investor, @alexatallah, founder at OpenRouter and OpenSea, @chrisprucha, investor and founder at Notion, @csahil28, founder and CEO at Glaive AI, and @UbertiGavin, founder and CEO at etched.ai (1/5)

157

1,081

101,747

Teknium 🪽 · Jan 31, 2025 · 12:31 AM UTC

Teknium 🪽

@Teknium

31 Jan 2025

Replying to @AndrewMayne @jeremyphoward

The code has nothing to do with the data dumbass lol

1,013

85,368

Teknium 🪽 · Jul 22, 2024 · 5:46 PM UTC

Teknium 🪽

@Teknium

22 Jul 2024

If this image is real comparing Llama-3.1 405/70/8b against gpt4o - we have SOTA Frontier Models available Open Source now:

125

1,045

184,013

Teknium 🪽 · Jan 1, 2025 · 8:09 PM UTC

Teknium 🪽

@Teknium

1 Jan 2025

tfw cant tell which one's delusional

1,040

87,565

Teknium 🪽 · Apr 18, 2024 · 4:08 PM UTC

Teknium 🪽

@Teknium

18 Apr 2024

Holy shit lol

Teknium 🪽

@Teknium

18 Apr 2024

These numbers are insane. I can't even imagine what the larger one(s) will be. Looks like Mistral 7B might be dead as of today though, and maybe even sonnet lol My favorite is the huge gains in coding capabilities

1,044

554,037

Teknium 🪽 · Oct 7, 2023 · 4:00 AM UTC

Teknium 🪽

@Teknium

7 Oct 2023

Introducing Mistral Trismegistus 7B - the first instruction dataset on the Esoteric, Spiritual, Occult, Wisdom Traditions, and all things paranormal trained on Mistral, and possibly, ever? Trismegistus was trained on ~35,000 instruction response pairs on knowledge & tasks for hundreds and hundreds of subtopics within the broad umbrella of Esoterica, including topics like Mysticism, Hermeticism, Necromancy, Religion, Trance, Meditation, Magick, Spirituality, Alchemy, Numerology, Tarot, and much much more weird stuff! The model is available NOW on my HuggingFace. The dataset will be released soon.

138

1,045

387,490

Teknium 🪽 · Dec 23, 2023 · 7:26 PM UTC

Teknium 🪽

@Teknium

23 Dec 2023

Open source models, open source datasets, open source code

Sam Altman

@sama

23 Dec 2023

what would you like openai to build/fix in 2024?

900

75,489

Teknium 🪽 · Jun 3, 2023 · 4:00 AM UTC

Teknium 🪽

@Teknium

3 Jun 2023

Announcing Nous-Hermes-13b - a Llama 13b model fine tuned on over 300,000 instructions! This is the best fine tuned 13b model I've seen to date, and I would even argue rivals GPT 3.5-turbo in many categories! See thread for output examples! Download: huggingface.co/NousResearch/…

NousResearch/Nous-Hermes-13b · Hugging Face

157

1,017

263,624

Teknium 🪽 · Nov 2, 2023 · 9:19 PM UTC

Teknium 🪽

@Teknium

2 Nov 2023

Today I am releasing Open Hermes 2.5! This model used the Hermes 2 dataset, with an added ~100k examples of Code Instructions, created by @GlaiveAI! This model was originally meant to be OpenHermes-2-Coder, but I discovered during the process that it also improved almost every other benchmark! Big improvements in HumanEval, but also in AGIEval and TruthfulQA, small improvement in GPT4All, and a slight decline in BigBench. This equated to a net gain across the board.

130

1,022

216,970

Teknium 🪽 · Feb 16, 2025 · 6:19 AM UTC

Teknium 🪽

@Teknium

16 Feb 2025

Dude o3-mini-high + deep research when I asked to make code that uses OAI Library + o3-mini-high, gave me a deprecated gpt3.5 api call and used a deprecated API schema (completions, not chat).. Whats the point of all this search and intelligence if you dont teach it how to use your own products lol

106

1,021

155,627

Teknium 🪽 · Aug 1, 2025 · 10:30 AM UTC

Teknium 🪽

@Teknium

1 Aug 2025

Looks like OpenAI's been using Nous' YaRN and kaiokendev's rope scaling for context length extension all along - of course never any credit but... Anyone who says "open source just steals from their 'real' research and rides on their shoulders" is completely wrong I called it when they released extended 128k context on gpt4 just a few weeks after Nous released yarn lol for context on yarn; deepseek and qwen also use it; Paper: arxiv.org/abs/2309.00071

Jimmy Apples 🍎/acc

@apples_jimmy

1 Aug 2025

Replying to @apples_jimmy

Eh It’s going to come out anyway now Config: {"num_hidden_layers": 36, "num_experts": 128, "experts_per_token": 4, "vocab_size": 201088, "hidden_size": 2880, "intermediate_size": 2880, "swiglu_limit": 7.0, "head_dim": 64, "num_attention_heads": 64, "num_key_value_heads": 8, "sliding_window": 128, "initial_context_length": 4096, "rope_theta": 150000, "rope_scaling_factor": 32.0, "rope_ntk_alpha": 1, "rope_ntk_beta": 32}

1,023

115,433
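
A quick check of the context-length arithmetic implied by the quoted config. This is a sketch under one assumption: that, as in YaRN-style RoPE scaling, the extended window is the initial context length multiplied by the scaling factor. The two values are copied from the config in the quoted tweet; the helper name is hypothetical.

```python
import json

# Only the two fields relevant to context extension, copied from the quoted config.
config = json.loads('{"initial_context_length": 4096, "rope_scaling_factor": 32.0}')


def extended_context_length(cfg):
    """Assumed relation: extended window = initial window * RoPE scaling factor."""
    return int(cfg["initial_context_length"] * cfg["rope_scaling_factor"])


print(extended_context_length(config))  # 131072, i.e. 128 * 1024 tokens
```

Under that assumption the config corresponds to a 128k-token window.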

Teknium 🪽 · Nov 1, 2024 · 6:39 PM UTC

Teknium 🪽

@Teknium

1 Nov 2024

Anthropic is writing papers about how AI deserves rights and that we should ban open source.. I hate that they have the best model but they do and so 🤷‍♂️

969

95,534

Teknium 🪽 · Jun 20, 2024 · 8:34 PM UTC

Teknium 🪽

@Teknium

20 Jun 2024

Announcing Hermes 2 Theta 70B!! Our most powerful model ever released, and our first model to catch up to GPT4 on MT Bench, and beat llama-3 70B instruct nearly across the board! We were able to do a full finetune of 70B to ensure maximum quality, worked with @chargoddard to merge in Llama-3 Instruct, and improved the RLHF pipeline to get the maximum capabilities out of Llama-3 70B! Fully capable of function calling, structured outputs for JSON mode and Feature extraction, and all the brains of L3 70B, beating it at every benchmark we tested except AGIEval, which we very closely come to matching. One tip though, because of the merge, add <|eot_id|> to your stop tokens in LMStudio and GGUF inference engines, it sometimes outputs this token as an artifact of llama-3 instruct.

Nous Research

@NousResearch

20 Jun 2024

Introducing Hermes 2 Theta 70B! Hermes 2 Theta is smarter, more creative, and capable of more than ever before. It takes a strong lead over Llama-3 Instruct 70B across a wide variety of benchmarks, and is a continuation of our collaboration with @chargoddard and @arcee_ai.

115

990

123,726
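
The stop-token tip in the tweet above can be illustrated with a small post-processing sketch. Inference engines normally take a list of stop strings in their own settings, so this is only an illustration of the effect, not how LM Studio or any GGUF engine implements it; the function name is hypothetical.

```python
def truncate_at_stop(text, stop_tokens=("<|eot_id|>",)):
    """Cut generated text at the earliest occurrence of any stop token.

    Returns the text unchanged when no stop token is present.
    """
    cut = len(text)
    for token in stop_tokens:
        index = text.find(token)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


raw_output = "The capital of France is Paris.<|eot_id|>assistant"
clean_output = truncate_at_stop(raw_output)
```

Here `clean_output` keeps only the text before the stray `<|eot_id|>` artifact.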

Teknium 🪽 · Feb 15, 2025 · 5:40 PM UTC

Teknium 🪽

@Teknium

15 Feb 2025

I mean you have to expect this when sama banned openais investors from investing in perplexity tbh

977

116,207

Teknium 🪽 · Mar 15, 2024 · 1:43 AM UTC

Teknium 🪽

@Teknium

15 Mar 2024

This explains why Yann is so bearish on LLMs... 😲

956

138,638

Teknium 🪽 · Jul 25, 2025 · 1:59 PM UTC

Teknium 🪽

@Teknium

25 Jul 2025

So to recap: - Yesterday, frontier closed model equivalent reasoning model from Qwen, - This morning, frontier closed model equivalent reasoning vision capabilities from stepfun - sometime today(?) a frontier video model from wan? All open source What is America doing?

Wan

@Alibaba_Wan

24 Jul 2025

Let’s sit down and await the release of Wan 2.2！

991

79,695

Teknium 🪽 · Jun 18, 2025 · 1:07 AM UTC

Teknium 🪽

@Teknium

18 Jun 2025

All of their best people already left lol

NIK

@ns123abc

17 Jun 2025

OpenAI CEO absolutely cooks Zucc’s Meta AI: > “Zucc is offering $100 million dollar signing bonuses to poach talent.” > “None of our best people have taken the offer yet.” > “I don’t think Meta is setting up for a great culture.” > “I think people think OpenAI has a better chance at reaching super intelligence and also may eventually be a more valuable company… then everybody will do great financially.” > “Meta is not great at innovation.” > “We understand a lot of things that they don’t about what it takes to succeed.” HOLY. FUCK. LMAO

949

77,281

Teknium 🪽 · Dec 3, 2023 · 5:45 PM UTC

Teknium 🪽

@Teknium

3 Dec 2023

Announcing Nous Hermes 2.5 Vision! @NousResearch's latest release builds on my Hermes 2.5 model, adding powerful new vision capabilities thanks to @stablequan! Download: huggingface.co/NousResearch/… Prompt the LLM with an Image! Function Calling on Visual Information! SigLIP Integration! This is Nous' latest version of a multimodal model with powerful capabilities, and further iterations will come in the future. Learn how to inference the model with instructions in the model card and stay tuned for GGUF quantization, with eventual support in @LMStudioAI and other inference engines!

162

955

215,459

Teknium 🪽 · Mar 17, 2024 · 7:36 PM UTC

Teknium 🪽

@Teknium

17 Mar 2024

Grok is out. 320~B Params - 8x33B MoE blog: x.ai/blog/grok-os code: github.com/xai-org/grok Download: magnet:?xt=urn:btih:5f96d43576e3d386c9ba65b883210a393b68210e&tr=https%3A%2F%2Facademictorrents.com%2Fannounce.php%3Fpasskey%3Decac4c57591b64a7911741df94f18b4b&t

122

955

222,976

Teknium 🪽 · Aug 8, 2025 · 7:04 AM UTC

Teknium 🪽

@Teknium

8 Aug 2025

Yea the router as everyone predicted is bad

Lisan al Gaib

@scaling01

8 Aug 2025

ChatGPT literally got worse for every single Plus user today. There's no way to reliably get thinking models anymore. Before we had o4-mini, o4-mini-high and o3. Now we have GPT-5 Thinking with 200 messages per week and a router that exclusively routes you to some small and shitty non-reasoning model.

971

89,040

Teknium 🪽 · Mar 16, 2023 · 9:19 AM UTC

Teknium 🪽

@Teknium

16 Mar 2023

Incredible. I gave GPT-4 this insanely complex image, and it worked.

119

932

310,607

Teknium 🪽 · Nov 15, 2025 · 12:07 AM UTC

Teknium 🪽

@Teknium

15 Nov 2025

He must'a got a chance to try gemini 3 lol

Andrew Curran

@AndrewCurran_

14 Nov 2025

Warren Buffett has taken a $4.3 billion dollar stake in Alphabet.

978

75,524

Teknium 🪽 · May 15, 2024 · 7:11 AM UTC

Teknium 🪽

@Teknium

15 May 2024

😲👀

930

100,399

Teknium 🪽 · Jan 30, 2025 · 9:28 PM UTC

Teknium 🪽

@Teknium

30 Jan 2025

It even has some commented lines so technically bigger than needed

897

154,289

Teknium 🪽 · Jan 30, 2025 · 11:00 PM UTC

Teknium 🪽

@Teknium

30 Jan 2025

Replying to @marktenenholtz

People taking this as "it's so easy anyone can do it" are misinterpreting. It's so easy because of everyone in open source building up and abstracting layers and providing reference implementations for others to build on. OpenSource ftw

910

54,792

Teknium 🪽 · Jan 31, 2025 · 12:09 AM UTC

Teknium 🪽

@Teknium

31 Jan 2025

The code: gist.github.com/willccbb/467…

GRPO Llama-1B

935

110,240

Teknium 🪽 · Feb 16, 2025 · 9:12 AM UTC

Teknium 🪽

@Teknium

16 Feb 2025

I almost fell for o3-mini being good. This is absurd

922

151,310

Teknium 🪽 · Feb 27, 2025 · 7:04 PM UTC

Teknium 🪽

@Teknium

27 Feb 2025

Guys, it's been 2+ years and 1000s of times more capital has been deployed since GPT-4... what the hell happened

Teknium 🪽

@Teknium

27 Feb 2025

mmmmmmmmmmmmmmmmmmm almost no gains here either...?

143

917

137,071

Teknium 🪽 · Feb 28, 2025 · 7:48 AM UTC

Teknium 🪽

@Teknium

28 Feb 2025

I think this is the most insane part of the 4.5 release to me. The knowledge cutoff is still 2023. How do you even have a current pretraining run that didn't see data past 2023? So many APIs and libraries from then are now deprecated, and so many new ones created... Did ChatGPT 3.5 data ruin 2024+ data? Or was this made a long, long time ago?

Enrico - big-AGI

@enricoros

27 Feb 2025

Replying to @Teknium @teknium

And don't use 4.5-preview for coding. Unless you love the ancient Oct 2023 updated knowledge about frameworks, and you love paying more for less.

905

159,173

Teknium 🪽 · Jul 13, 2025 · 1:13 AM UTC

Teknium 🪽

@Teknium

13 Jul 2025

Had no idea these existed

912

174,334

Teknium 🪽 · Mar 2, 2025 · 8:00 PM UTC

Teknium 🪽

@Teknium

2 Mar 2025

This is how degraded they've made DALL-E since prelaunch

Adam.GPT

@TheRealAdamG

18 Jul 2022

"a portrait photo of a parrot sipping a fruity drink through a straw in Margaritaville" #DALLE

885

102,054

Teknium 🪽 · Dec 22, 2024 · 6:28 PM UTC

Teknium 🪽

@Teknium

22 Dec 2024

Ok, tried Devin on two new tasks. One was automatically creating data visualizations of benchmarks, which took about 4 hours and a ton of back-and-forth debugging with it to get it... mostly working. The second was having it just create a readme after looking over all the code. It hallucinated nearly everything. I can't recommend Devin to others, especially at this price, at this time. If you've had a better experience, let me know though! We have many hours left in this month's subscription... Will probably try out more

892

134,846

Teknium 🪽 · Dec 3, 2024 · 10:57 AM UTC

Teknium 🪽

@Teknium

3 Dec 2024

I want to build a really cool home library, like 200-300 books on a series of bookshelves. Give me your, like, 3 absolute top books, fiction or non-fiction, doesn't matter.

451

885

189,513

Teknium 🪽 · Nov 10, 2024 · 6:12 PM UTC

Teknium 🪽

@Teknium

10 Nov 2024

I hope Yann's descent into madness doesn't slow timelines for Llama-4 lol

879

46,954

Teknium 🪽 · Jun 11, 2025 · 10:15 PM UTC

Teknium 🪽

@Teknium

11 Jun 2025

Claude just casually deleting a full day's work on an environment for no fucking reason - fuck you claude

109

885

79,072

Teknium 🪽 · Dec 20, 2024 · 4:34 AM UTC

Teknium 🪽

@Teknium

20 Dec 2024

If it's o2/3 I'm going to be severely disappointed. All we want is GPT-4.5/5 ffs lol

Sam Altman

@sama

20 Dec 2024

Replying to @sama

fine one clue should have said oh oh oh

864

159,526

Teknium 🪽 · Jan 29, 2024 · 4:52 PM UTC

Teknium 🪽

@Teknium

29 Jan 2024

Meta is releasing a new CodeLlama

860

75,922

Teknium 🪽 · Sep 30, 2023 · 5:41 AM UTC

Teknium 🪽

@Teknium

30 Sep 2023

Thanks DALL-E 3, everyone needed to know what Pikachu's musculoskeletal system looked like

858

121,858

Teknium 🪽 · Apr 2, 2023 · 4:46 AM UTC

Teknium 🪽

@Teknium

2 Apr 2023

I have a true gift for LLM Devs and the opensource AI community. Several GPT-4 Generated datasets. Toolformer, Instruct, Roleplay-Instruct, and soon, Code-Instruct datasets, all generated from GPT-4. I hope I can give back more! Check them out here: github.com/teknium1/GPTeache…

A collection of modular datasets generated by GPT-4, General-Instruct - Roleplay-Instruct - Code-Instruct - and Toolformer - teknium1/GPTeacher

github.com

144

882

133,405

Teknium 🪽 · Aug 31, 2024 · 12:23 AM UTC

Teknium 🪽

@Teknium

31 Aug 2024

I think a problem at a lot of AI companies right now is that they are lacking creative types

836

85,406

Teknium 🪽 · Jun 22, 2024 · 6:47 PM UTC

Teknium 🪽

@Teknium

22 Jun 2024

I think I hate gpt4o

814

112,958

Teknium 🪽 · May 16, 2025 · 5:34 AM UTC

Teknium 🪽

@Teknium

16 May 2025

I switched off Claude in Cursor to Gemini btw

111

869

72,175

Teknium 🪽 · Apr 18, 2024 · 4:07 PM UTC

Teknium 🪽

@Teknium

18 Apr 2024

835

248,793

Teknium 🪽 · Jan 27, 2025 · 2:23 AM UTC

Teknium 🪽

@Teknium

27 Jan 2025

What a dumb person

Gary Marcus

@GaryMarcus

27 Jan 2025

Congress needs to bring in Zuckerberg and LeCun to discuss how their unilateral open-sourcing decision rapidly undermined the US advantage in Generative AI. Tomorrow.

827

70,951