Oleksii Kuchaiev · Jun 4, 2026 · 1:28 PM UTC

Oleksii Kuchaiev

@kuchaev

Jun 4

Replying to @kuchaev

Our post-training pipeline is a substantial redesign from Super. The core idea: don't rely on stacked RL stages alone. We do SFT, multi-environment RLVR across a huge mix of agentic/reasoning/code/safety environments, then Multi-teacher On-Policy Distillation (MOPD). 10+ domain-specialized teachers, merged into the student via dense token-level guidance on its own rollouts. See Figures below for overview and tech report for all the details. 2/4

279

105,996
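
The "dense token-level guidance on its own rollouts" in the post above can be pictured with a small sketch. The post does not state the loss, so everything here is an assumption for illustration: a per-token KL(student || teacher) averaged over a rollout the student generated itself, with the teacher picked per domain. The function names and the toy logits are hypothetical; the tech report is the authority on what MOPD actually does.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def reverse_kl(student_logits, teacher_logits):
    """KL(student || teacher) for one token position (assumed loss, not from the post)."""
    p = softmax(student_logits)
    q = softmax(teacher_logits)
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q))

def on_policy_distill_loss(student_steps, teacher_steps):
    """Mean per-token divergence over one rollout sampled from the student."""
    per_token = [reverse_kl(s, t) for s, t in zip(student_steps, teacher_steps)]
    return sum(per_token) / len(per_token)

# Toy rollout of two token positions over a three-token vocabulary.
student = [[2.0, 0.5, 0.1], [0.3, 0.3, 0.3]]
# Hypothetical domain teacher that disagrees with the student at the first position.
math_teacher = [[0.1, 2.0, 0.5], [0.3, 0.3, 0.3]]

loss_vs_self = on_policy_distill_loss(student, student)
loss_vs_teacher = on_policy_distill_loss(student, math_teacher)
```

Every position in the rollout gets its own signal here, which is the contrast with a single scalar reward per completion.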

Oleksii Kuchaiev · Apr 8, 2025 · 3:13 AM UTC

Oleksii Kuchaiev

@kuchaev

8 Apr 2025

We are excited to release Llama-Nemotron-Ultra! This is a reasoning ON/OFF, dense 253B model. Open weights and post-training data. huggingface.co/nvidia/Llama-… We started with llama-405B, changed it via NAS pruning then followed by reasoning-focused post-training: SFT + RL in FP8.

123

701

166,499

Oleksii Kuchaiev · Jun 14, 2024 · 4:17 PM UTC

Oleksii Kuchaiev

@kuchaev

14 Jun 2024

Today we are happy to release best open models for synthetic data generation. 340B parameters, includes base, instruct and reward models. As well as new human preference dataset HelpSteer2. 340B-Reward model is #1 on the Reward Bench leaderboard. blogs.nvidia.com/blog/nemotr…

NVIDIA Releases Open Synthetic Data Generation Pipeline for Training Large Language Models

Nemotron-4 340B, a family of models optimized for NVIDIA NeMo and NVIDIA TensorRT-LLM, includes cutting-edge instruct and reward models, and a dataset for generative AI training.

blogs.nvidia.com

440

242,091

Oleksii Kuchaiev · May 16, 2025 · 4:45 PM UTC

Oleksii Kuchaiev

@kuchaev

16 May 2025

NeMo RL is now open source! It replaces NeMo-Aligner and is the toolkit we use to post train next generations of our models. Give it a try github.com/NVIDIA/NeMo-RL

GitHub - NVIDIA-NeMo/RL: Scalable toolkit for efficient model reinforcement

Scalable toolkit for efficient model reinforcement - NVIDIA-NeMo/RL

github.com

392

25,041

Oleksii Kuchaiev · May 5, 2025 · 3:28 AM UTC

Oleksii Kuchaiev

@kuchaev

5 May 2025

Llama-Nemotron-v1 technical report is now available on arxiv arxiv.org/pdf/2505.00949v1

343

28,816

Oleksii Kuchaiev · Aug 18, 2025 · 6:30 PM UTC

Oleksii Kuchaiev

@kuchaev

18 Aug 2025

We are excited to release Nvidia-Nemotron-Nano-V2 model! This is a 9B hybrid SSM model with open base model and training data. This model also supports runtime "thinking" budget control. HF collection with base and post trained models: huggingface.co/collections/n…

298

65,402

Oleksii Kuchaiev · Jul 30, 2025 · 4:20 PM UTC

Oleksii Kuchaiev

@kuchaev

30 Jul 2025

Everything about Llama-Nemotron-Super-V1.5 post-training is now open: Synthetic data: huggingface.co/datasets/nvid… Human data: huggingface.co/datasets/nvid… Reward models (trained on HS3 data): huggingface.co/collections/n… RL toolkit: github.com/NVIDIA-NeMo/RL

nvidia/Nemotron-Post-Training-Dataset-v1 · Datasets at Hugging Face

We’re on a journey to advance and democratize artificial intelligence through open source and open science.

huggingface.co

245

14,403

Oleksii Kuchaiev · Mar 16, 2024 · 7:50 PM UTC

Oleksii Kuchaiev

@kuchaev

16 Mar 2024

In LLM pre training, curating and preparing data is perhaps the most impactful step. NeMo data curator is now open source with lots of features you will need. We used it to curate trillions of tokens for our own models training. github.com/NVIDIA/NeMo-Curat…

GitHub - NVIDIA-NeMo/Curator: Scalable data pre processing and curation toolkit for LLMs

Scalable data pre processing and curation toolkit for LLMs - NVIDIA-NeMo/Curator

github.com

222

19,101

Oleksii Kuchaiev · Jul 2, 2025 · 4:49 PM UTC

Oleksii Kuchaiev

@kuchaev

2 Jul 2025

Post-training of LLMs is increasingly important and RLHF remains a necessary step for an overall great model. Today we are releasing 6 new reward models, including GenRMs and multilingual. These models are used to post-train next *-nemotron models. huggingface.co/collections/n…

Reward Models 06-2025 - a nvidia Collection

Nemotron reward models. For use in RLHF pipelines and LLM-as-a-Judge

huggingface.co

203

13,358

Oleksii Kuchaiev · Jul 15, 2025 · 4:17 PM UTC

Oleksii Kuchaiev

@kuchaev

15 Jul 2025

If you are a researcher working on LLM post-training, RL and reasoning, you should really give NeMo-RL a try. Works with huggingface and megatron-core (when you need scale). Here is a great blogpost by @AlexanderBukha1 and team on how to get started: nvidia-nemo.github.io/blog/2…

200

17,246

Oleksii Kuchaiev · Mar 18, 2025 · 7:22 PM UTC

Oleksii Kuchaiev

@kuchaev

18 Mar 2025

We are excited to release new Llama-Nemotron models. These models allow you to set reasoning ON/OFF during runtime. We also release all the post-training data under CC-BY-4! Try it now on build.nvidia.com/nvidia/llam… HF collection: huggingface.co/collections/n…

192

51,783

Oleksii Kuchaiev · Jul 25, 2025 · 11:43 PM UTC

Oleksii Kuchaiev

@kuchaev

25 Jul 2025

Very excited to announce Llama-Nemotron-Super-V1.5! Super-V1.5 is now better than Ultra-V1. This is currently the best model that can be deployed on a single H100. Reasoning On/Off and drop in replacement for V1. Open-weight, code and data on HF huggingface.co/nvidia/Llama-…

184

42,622

Oleksii Kuchaiev · Oct 6, 2025 · 1:28 AM UTC

Oleksii Kuchaiev

@kuchaev

6 Oct 2025

✈️to COLM2025. And I am looking for exceptional RL and post-training engineers who are excited to push frontiers of open-source post-training and open models such as Nemotron. • At the conference? Message me on Whova. • Not attending? DMs are open. Send your CV & a short note.

190

62,215

Oleksii Kuchaiev · Apr 9, 2025 · 7:16 PM UTC

Oleksii Kuchaiev

@kuchaev

9 Apr 2025

We just updated Llama-Nemotron post training dataset with additional 2.2M math and 500K code reasoning examples used in Llama-Nemotron-Ultra training huggingface.co/datasets/nvid…

nvidia/Llama-Nemotron-Post-Training-Dataset · Datasets at Hugging Face

huggingface.co

clem 🤗

@ClementDelangue

8 Apr 2025

What's cool about @nvidia is that in addition to models, they release tons of cool datasets! Why are the other big tech not doing that too? huggingface.co/nvidia

157

18,158

Oleksii Kuchaiev · Jun 14, 2024 · 5:49 PM UTC

Oleksii Kuchaiev

@kuchaev

14 Jun 2024

For those wondering, "june-chatbot" on @lmsysorg is exactly this model posted on @huggingface huggingface.co/nvidia/Nemotr…

nvidia/Nemotron-4-340B-Instruct · Hugging Face

huggingface.co

133

20,980

Oleksii Kuchaiev · Mar 7, 2025 · 6:36 PM UTC

Oleksii Kuchaiev

@kuchaev

7 Mar 2025

New paper from our team. An inference-time scaling approach which can boost non-math benchmarks such as Arena-Hard of existing models. We get Arena-Hard of 92.7 for 70B model. As of 5 Mar 2025, surpassing o1-preview-2024-09-12 (90.4) and DS-R1 (92.3). arxiv.org/pdf/2503.04378

130

23,095

Oleksii Kuchaiev · Jul 29, 2025 · 12:56 AM UTC

Oleksii Kuchaiev

@kuchaev

29 Jul 2025

Llama-Nemotron-Super-V1.5 got AA intelligence index of 64. This is more than previous Ultra (61) model and is by far the "smartest" open-weights dense model. The best model for deployment on a single H100. Head to @ArtificialAnlys for detailed analysis artificialanalysis.ai/models…

134

5,663

Oleksii Kuchaiev · Jul 7, 2022 · 11:04 PM UTC

Oleksii Kuchaiev

@kuchaev

7 Jul 2022

NeMo speech recognition models published on @huggingface hub are now at the top of HF Speech Bench for all languages where we published the models so far: English, German, French and Chinese. Often beating other models without external LMs and with fewer parameters. #DeepLearning

121

Oleksii Kuchaiev · Jun 6, 2025 · 5:53 PM UTC

Oleksii Kuchaiev

@kuchaev

6 Jun 2025

New reasoning Nemotron-H models are now publicly available. These models are based on hybrid architecture! 47B and 8B in BF16 and FP8. Blogpost: developer.nvidia.com/blog/ne… Weights: huggingface.co/collections/n…

Introducing the Nemotron-H Reasoning Model Family: Throughput Gains Without Compromise | NVIDIA...

As large language models increasingly take on reasoning-intensive tasks in areas like math and science, their output lengths are getting significantly longer—sometimes spanning tens of thousands of…

developer.nvidia.com

Adi Renduchintala @rendu_a

6 Jun 2025

Transformers are still dominating the LLM scene but we show that higher throughput alternatives exist which are just as strong! Grateful to have a part in Nemotron-H Reasoning effort. 🙏 Technical report will be out soon, stay tuned!

122

23,844

Oleksii Kuchaiev · Aug 12, 2025 · 8:38 PM UTC

Oleksii Kuchaiev

@kuchaev

12 Aug 2025

New VLM training data drop!

NVIDIA AI Developer

@NVIDIAAIDev

12 Aug 2025

We just released 3 million samples of high quality vision language model training dataset for use cases such as: 📄 optical character recognition (OCR) 📊 visual question answering (VQA) 📝 captioning 🤗 Learn more: nvda.ws/4oyfevu 📥 Download: nvda.ws/4fz2gtB

116

5,481

Oleksii Kuchaiev · Apr 8, 2025 · 3:19 AM UTC

Oleksii Kuchaiev

@kuchaev

8 Apr 2025

Replying to @teortaxesTex

We used R1 as teacher for lots of things (see diagram). But to push scientific reasoning (GPQA) beyond R1's number (71.5=>76) it took a big reasoning RL (GRPO) run in FP8.

118

4,178

Oleksii Kuchaiev · Oct 24, 2024 · 12:36 AM UTC

Oleksii Kuchaiev

@kuchaev

24 Oct 2024

Llama-3.1-Nemotron-70B-Instruct model aligned by our team is now live on lmarena.ai leaderboard with overall rank 9. Everything used to create this model is public: code, data and reward model. HF checkpoint: huggingface.co/nvidia/Llama-…

34,657

Oleksii Kuchaiev · Jul 25, 2025 · 6:17 PM UTC

Oleksii Kuchaiev

@kuchaev

25 Jul 2025

NeMo-RL team keeps shipping! v0.3.0 release adds @deepseek_ai's DeepSeek-V3 support as well as @Alibaba_Qwen's Qwen 3 models. github.com/NVIDIA-NeMo/RL/re…

Release Release 0.3.0 · NVIDIA-NeMo/RL

🚀 Release v0.3.0 📝 Blog Our latest blog post shares highlights and progress from recent work—take a look! ✨ Highlights 🏗️ Improved Training Throughput and Scalability via Megatron-Core Backend In...

github.com

11,188

Oleksii Kuchaiev · Nov 17, 2023 · 5:12 PM UTC

Oleksii Kuchaiev

@kuchaev

17 Nov 2023

Our team just released a new dataset for LLM model alignment, called HelpSteer huggingface.co/datasets/nvid… on @huggingface Hub under CC-BY-4 license! This one should be used for reward model training, especially with SteerLM method.

nvidia/HelpSteer · Datasets at Hugging Face

huggingface.co

22,358

Oleksii Kuchaiev · Oct 11, 2023 · 8:43 PM UTC

Oleksii Kuchaiev

@kuchaev

11 Oct 2023

Our team is happy to share SteerLM, a simpler alternative to RLHF which allows dynamic model controls during inference (humor, verbosity, etc.). To appear in Findings of EMNLP 2023. It is implemented in NeMo (open-source) and example model is on HF arxiv.org/abs/2310.05344

17,435

Oleksii Kuchaiev · Mar 21, 2025 · 4:16 PM UTC

Oleksii Kuchaiev

@kuchaev

21 Mar 2025

Llama-3.3-Nemotron-Super-49B-v1 is on LMArena leaderboard. Head to the huggingface.co/collections/n… for entire post-training data and model weights! Or try it now from the browser on build.nvidia.com

Llama Nemotron - a nvidia Collection

Open, Production-ready Enterprise Models

huggingface.co

Arena.ai

@arena

21 Mar 2025

New on LMArena: @Nvidia's Llama-3.3-Nemotron-Super-49B-v1 lands at #14! A powerful open reasoning model—top-15 overall, excelling in math, with an openly released 15M post-training dataset. Congrats to the @NvidiaAI Nemo team for this fantastic contribution to the open community!

10,659

Oleksii Kuchaiev · Feb 4, 2025 · 5:15 PM UTC

Oleksii Kuchaiev

@kuchaev

4 Feb 2025

Our team put together a unified mathematical framework to analyze popular model alignment algorithms. “Reward-aware Preference Optimization: A Unified Mathematical Framework for Model Alignment” arxiv.org/pdf/2502.00203

6,585

Oleksii Kuchaiev · Oct 6, 2020 · 10:14 PM UTC

Oleksii Kuchaiev

@kuchaev

6 Oct 2020

Very happy to share our latest NeMo release. We re-designed NeMo to work with @PyTorchLightnin and @Hydra_Framework projects from @PyTorch ecosystem. Train your own #ASR, #NLP and #TTS models or re-use one of the many pre-trained models we have.

PyTorch

@PyTorch

6 Oct 2020

NeMo, @NVIDIA’s open-source toolkit based on #PyTorch, allows you to quickly build, train, and fine-tune conversational AI models. See how speech recognition, natural language processing and speech synthesis can be improved in this tutorial: bit.ly/nvidia-nemo-introduct…

Oleksii Kuchaiev · Sep 29, 2024 · 8:47 PM UTC

Oleksii Kuchaiev

@kuchaev

29 Sep 2024

Thank you @GavinNewsom for veto on SB 1047. @WSJ , no most researchers did not support that bill, only a small minority of them did. Application layer is a place for AI regulation, not fundamental model development.

5,253

Oleksii Kuchaiev · Jun 17, 2024 · 9:57 PM UTC

Oleksii Kuchaiev

@kuchaev

17 Jun 2024

You can try Nemotron-4-340B-Instruct model here build.nvidia.com/nvidia/nemo…

8,337

Oleksii Kuchaiev · Jun 18, 2024 · 2:22 AM UTC

Oleksii Kuchaiev

@kuchaev

18 Jun 2024

Replying to @agihippo

I guess you missed the part where we used 1000x less human data (10K vs 10M) for alignment than llama3. this is about synthetic data generation, literally says so in the blogpost. Also we released reward model and training data for it, all under commercial friendly license.

2,241

Oleksii Kuchaiev · Jul 31, 2025 · 3:16 PM UTC

Oleksii Kuchaiev

@kuchaev

31 Jul 2025

Replying to @ClementDelangue

One exception - NVIDIA. All you need to do is get your manager's OK to publish and in all my time here I've never seen that denied or even delayed.

2,399

Oleksii Kuchaiev · Jul 8, 2024 · 4:20 PM UTC

Oleksii Kuchaiev

@kuchaev

8 Jul 2024

Replying to @razomforukraine

@POTUS and @WhiteHouse you must act on this

839

Oleksii Kuchaiev · Apr 15, 2024 · 6:28 PM UTC

Oleksii Kuchaiev

@kuchaev

15 Apr 2024

Replying to @srush_nlp

Maybe it is because de-noising objective is "wasting" tokens compared to autoregressive models. E.g. when you mask 15% of tokens, then after 1 epoch you've backpropagated loss from only 15% of your tokens, compared to 100% in next token prediction loss.

6,238
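
The arithmetic behind the reply above is easy to check. This is a minimal sketch with a made-up corpus size; it only counts how many positions produce a loss term in one epoch and, as a simplification, treats every position as a target in the next-token case.

```python
def supervised_tokens(num_tokens: int, loss_fraction: float) -> int:
    """Number of positions that contribute a loss term in one pass over the data."""
    return round(num_tokens * loss_fraction)

corpus = 1_000_000  # hypothetical corpus size in tokens

masked = supervised_tokens(corpus, 0.15)         # de-noising: only masked positions are predicted
autoregressive = supervised_tokens(corpus, 1.0)  # next-token prediction: every position is a target

ratio = autoregressive / masked
print(f"masked: {masked}, autoregressive: {autoregressive}, ratio: {ratio:.2f}x")
```

Under these assumptions one epoch of next-token prediction yields roughly 6.7 times as many supervised positions as 15% masking.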

Oleksii Kuchaiev · Dec 6, 2023 · 1:42 AM UTC

Oleksii Kuchaiev

@kuchaev

6 Dec 2023

NeMo Aligner, a toolkit for scalable AI model alignment is now Open Source! github.com/NVIDIA/NeMo-Align… Currently it supports PPO-based RLHF, SteerLM and DPO. #EMNLP2023 #nlproc

GitHub - NVIDIA/NeMo-Aligner: Scalable toolkit for efficient model alignment

Scalable toolkit for efficient model alignment. Contribute to NVIDIA/NeMo-Aligner development by creating an account on GitHub.

github.com

2,750

Oleksii Kuchaiev · Jun 18, 2024 · 11:07 PM UTC

Oleksii Kuchaiev

@kuchaev

18 Jun 2024

Nemotron-4-340B-*Reward* model is now available via API on build.nvidia.com/nvidia/nemo… :) Give it a try.

3,098

Oleksii Kuchaiev · Jun 14, 2024 · 4:18 PM UTC

Oleksii Kuchaiev

@kuchaev

14 Jun 2024

Models on @huggingface under Nvidia Open Model License huggingface.co/collections/n…

Nemotron 4 340B - a nvidia Collection

Nemotron-4: open models for Synthetic Data Generation (SDG). Includes Base, Instruct, and Reward models.

huggingface.co

3,656

Oleksii Kuchaiev · Mar 19, 2025 · 4:54 PM UTC

Oleksii Kuchaiev

@kuchaev

19 Mar 2025

HelpSteer3 data is public now! 40K prompts with 2 responses each. It has ratings, justifications for them, feedback on responses, and edited responses. CC-BY-4.0 and collected from professional human raters. Using this data we can get 93.4 on Arena-Hard. huggingface.co/collections/n…

Llama Nemotron Feedback-Edit Inference-Time Scaling - a nvidia Collection

Novel ITS approach for open-ended tasks - No. 1 on Arena Hard on 18 Mar 2025

huggingface.co

2,677

Oleksii Kuchaiev · May 17, 2025 · 1:31 AM UTC

Oleksii Kuchaiev

@kuchaev

17 May 2025

I don’t understand why people think AI will make coding, SW jobs or CS obsolete. Instead an order of magnitude more people will do it and will do it much earlier and be 100x more productive.

1,682

Oleksii Kuchaiev · Apr 7, 2025 · 9:50 PM UTC

Oleksii Kuchaiev

@kuchaev

7 Apr 2025

New post-training data drop!

Somshubra Majumdar @HaseoX94

7 Apr 2025

Open Code Reasoning is our latest dataset to train SOTA code reasoning capabilities in all model sizes ! With it, even 7B Qwen can reach 51% on LiveCodeBench, 32B hits 61% with just SFT alone ! Model release soon, paper and dataset are out !

3,500

Oleksii Kuchaiev · Sep 30, 2024 · 8:27 PM UTC

Oleksii Kuchaiev

@kuchaev

30 Sep 2024

Reward modeling is a key step of AI development. Our team just released new SOTA RM on build.nvidia.com/nvidia/llam…. We also release model’s weights on @huggingface and updated (with new dimensions) version of HelpSteer 2 dataset (ccby4) used to train it. huggingface.co/nvidia/Llama-…

5,965

Oleksii Kuchaiev · Aug 4, 2022 · 9:50 PM UTC

Oleksii Kuchaiev

@kuchaev

4 Aug 2022

NeMo now has Ukrainian speech recognition model on @huggingface hub. This is a CitriNet model tuned by our intern working from Kyiv huggingface.co/nvidia/stt_uk… As of today, I think, this is the best Ukrainian ASR model freely available.

Oleksii Kuchaiev · Nov 28, 2023 · 10:40 PM UTC

Oleksii Kuchaiev

@kuchaev

28 Nov 2023

Check out llama2-70B-SteerLM model which gets 7.54 on MT-bench. This model is NOT using outputs of stronger (ChatGPT) models during alignment which allowed us to keep llama2 license. Try now on NGC Catalog catalog.ngc.nvidia.com/orgs/… . Also on @huggingface hub huggingface.co/nvidia/Llama2…

2,597

Oleksii Kuchaiev · Jul 30, 2025 · 4:10 PM UTC

Oleksii Kuchaiev

@kuchaev

30 Jul 2025

Happy to help do our part towards open AI 🙂

clem 🤗

@ClementDelangue

30 Jul 2025

Fun visualization by @aiworld_eu! @nvidia added 365 public model/dataset/apps on @huggingface in the past 12 months (one a day!) aiworld.eu/story/nvidia-domi…

1,085

Oleksii Kuchaiev · Apr 14, 2025 · 4:59 PM UTC

Oleksii Kuchaiev

@kuchaev

14 Apr 2025

New Nemotrons - mamba-transformer hybrid base models are now on @huggingface Hub!

Mostofa Patwary @mapatwary

14 Apr 2025

Nemotron-H base models (8B/47B/56B): A family of Hybrid Mamba-Transformer LLMs are now available on HuggingFace: huggingface.co/nvidia/Nemotr… huggingface.co/nvidia/Nemotr… huggingface.co/nvidia/Nemotr… Technical Report: arxiv.org/abs/2504.03624 Blog: research.nvidia.com/labs/adl…

1,418

Oleksii Kuchaiev · Jul 17, 2025 · 12:24 AM UTC

Oleksii Kuchaiev

@kuchaev

17 Jul 2025

✈️ to ICML workshops to talk about the first open-weight model that outsmarted original DS-R1 on AA index. Happy to chat all things post-training and AI in general. (The poster is at the EXAIT workshop this Saturday)

1,732

Oleksii Kuchaiev · Jun 13, 2025 · 12:07 AM UTC

Oleksii Kuchaiev

@kuchaev

13 Jun 2025

Replying to @teortaxesTex

Meta is likely making a mistake doubling down on imitation learning (labeled data) in the era of exploration learning.

1,343

Oleksii Kuchaiev · Jun 12, 2025 · 8:37 PM UTC

Oleksii Kuchaiev

@kuchaev

12 Jun 2025

AI model post training is rapidly improving. The plot below (starting from the same base model) illustrates about 10 months of progress in the *open* post-training research. I’m not convinced that closed research can move as fast.

1,477

Oleksii Kuchaiev · Jun 14, 2022 · 2:19 AM UTC

Oleksii Kuchaiev

@kuchaev

14 Jun 2022

Oh wow! Look what model is #1 on @huggingface speech bench for English speech recognition

Somshubra Majumdar @HaseoX94

14 Jun 2022

The largest NeMo ASR model is finally public on @huggingface ! This is a 600 M params Conformer Transducer X-Large, probably the largest public checkpoint trained with multiple datasets. huggingface.co/nvidia/stt_en…

Oleksii Kuchaiev · Apr 8, 2025 · 11:12 PM UTC

Oleksii Kuchaiev

@kuchaev

8 Apr 2025

Try it now on build.nvidia.com

NVIDIA AI Developer

@NVIDIAAIDev

8 Apr 2025

🎊 Llama Nemotron Ultra 253B is here 🎊 ✅ 4x higher inference throughput over DeepSeek R1 671B 🏆Highest accuracy on reasoning benchmarks: 💎 GPQA-Diamond for advanced scientific reasoning 💎 AIME 2024/25 for complex math 💎 LiveCodeBench for code generation and completion Try as #NVIDIANIM ➡️ build.nvidia.com/nvidia/llam… Technical deep dive ➡️ developer.nvidia.com/blog/bu…

3,179

Oleksii Kuchaiev · Jan 24, 2020 · 5:49 PM UTC

Oleksii Kuchaiev

@kuchaev

24 Jan 2020

A paper arxiv.org/pdf/1910.10261.pdf about our latest speech recognition model - QuartzNet has been accepted to #ICASSP 2020. Head over to github.com/NVIDIA/NeMo for implementation and pretrained models. #DeepLearning #asr

Oleksii Kuchaiev · Aug 3, 2024 · 7:47 PM UTC

Oleksii Kuchaiev

@kuchaev

3 Aug 2024

Replying to @paulg

I grew up near Chernobyl and have always believed that its most devastating impact is a subsequent push back against nuclear power. Interesting fact - the Chernobyl station kept working after disaster (1986) until 1999 when it was fully shut down under the Western pressure.

565

Oleksii Kuchaiev · Sep 17, 2025 · 9:27 PM UTC

Oleksii Kuchaiev

@kuchaev

17 Sep 2025

Do you want to work on LLM and DLM model post-training with us? @JiantaoJ is hiring! nvidia.wd5.myworkdayjobs.com…

2,681

Oleksii Kuchaiev · Jun 18, 2024 · 2:19 AM UTC

Oleksii Kuchaiev

@kuchaev

18 Jun 2024

Replying to @fleetingbits @lmsysorg @NVIDIAAI

Another way to put it: we used 1000x less (10K vs 10M) human data for alignment than llama3 by using synthetic data. This release is about synthetic data generation which is why our license explicitly allows it.

371

Oleksii Kuchaiev · Jun 17, 2024 · 11:01 PM UTC

Oleksii Kuchaiev

@kuchaev

17 Jun 2024

Replying to @arena @lmsysorg @NVIDIAAI

With a license that permits synthetic data generation and commercial use.

1,218

Oleksii Kuchaiev · Jun 22, 2024 · 10:35 PM UTC

Oleksii Kuchaiev

@kuchaev

22 Jun 2024

Replying to @fchollet

people claiming that some LLM is at "high schooler" or even "kindergartner" level should spend more time around kids. No AI system today is near the level of even 3 year old when it comes to general intelligence. Moreover, the progress towards that level is unclear.

2,533

Oleksii Kuchaiev · Jul 29, 2025 · 1:11 AM UTC

Oleksii Kuchaiev

@kuchaev

29 Jul 2025

Replying to @kuchaev @ArtificialAnlys

Post training is an exciting area with a lot of gains to be had. This model is built starting from llama-3.3-70B-instruct, AA index of 41. So +23 points from redoing post training stage.

953

Oleksii Kuchaiev · Aug 29, 2024 · 12:51 AM UTC

Oleksii Kuchaiev

@kuchaev

29 Aug 2024

Replying to @AnimaAnandkumar @drfeifei @ylecun @Caltech

@GavinNewsom , respectfully asking you to veto SB1047. As written it will hurt AI innovation in California.

2,071

Oleksii Kuchaiev · Feb 5, 2022 · 7:01 AM UTC

Oleksii Kuchaiev

@kuchaev

5 Feb 2022

Just finished listening to "Viral" audiobook by @Ayjchan and @mattwridley. This is an excellent account of the most likely origins of COVID-19 pandemic. A must read for anyone (which really should be everyone) interested in the origins of #COVID19 audible.com/pd/Viral-Audiobo…

Oleksii Kuchaiev · Apr 11, 2023 · 6:20 AM UTC

Oleksii Kuchaiev

@kuchaev

11 Apr 2023

Replying to @mattrickard

you missed NeMo models huggingface.co/models?sort=d… (1, 5 and 20B GPTs with commercially friendly license)

Models – Hugging Face

Explore machine learning models.

huggingface.co

2,519

Oleksii Kuchaiev · Oct 11, 2024 · 4:00 PM UTC

Oleksii Kuchaiev

@kuchaev

11 Oct 2024

Replying to @natolambert

have you tried recent FSD though? I tried it a year ago and thought only a fool would pay for that. But I find the recent version is so good that I am paying now and don’t want to turn it off. My driving is mostly suburban or longer trips.

1,039

Oleksii Kuchaiev · Dec 16, 2022 · 6:10 PM UTC

Oleksii Kuchaiev

@kuchaev

16 Dec 2022

Two very real steps anyone in the world can help: 1) Consider donating to humanitarian relief efforts, such as @razomforukraine and there are many others. If your company has a match - please make sure you make use of that. #UkraineRussiaWar

865

Oleksii Kuchaiev · Apr 24, 2025 · 8:51 AM UTC

Oleksii Kuchaiev

@kuchaev

24 Apr 2025

If you are at #ICLR and looking for applied research roles in model post-training, reasoning and model alignment - message me and I’ll be happy to chat.

3,404

Oleksii Kuchaiev · Jun 14, 2024 · 7:19 PM UTC

Oleksii Kuchaiev

@kuchaev

14 Jun 2024

Replying to @geoff_l

generating synthetic data for alignment of smaller models is key use case we have in mind.

1,432

Oleksii Kuchaiev · Nov 29, 2023 · 5:30 PM UTC

Oleksii Kuchaiev

@kuchaev

29 Nov 2023

Replying to @natolambert @srush_nlp

Yes, PPO is much more expensive than DPO in terms of infra but in all our experiments so far, on the same data (and no online setting) PPO>DPO on MT-bench.

635

Oleksii Kuchaiev · Dec 3, 2022 · 11:14 PM UTC

Oleksii Kuchaiev

@kuchaev

3 Dec 2022

#ChatGPT is very impressive! In the dialogue below, it comes up with a suboptimal solution, and argues a little without admitting a mistake. Then takes a hint, admits the mistake, and fixes its solution!

Oleksii Kuchaiev · Sep 23, 2025 · 5:37 PM UTC

Oleksii Kuchaiev

@kuchaev

23 Sep 2025

A great video by @ctnzr on what Nemotron is and why it is open piped.video/_y9SEtn1lU8?si=65jm… !

1,449

Oleksii Kuchaiev · Oct 3, 2024 · 3:53 PM UTC

Oleksii Kuchaiev

@kuchaev

3 Oct 2024

This is currently the best 8B instruct model on most benchmarks

Pavlo Molchanov

@PavloMolchanov

3 Oct 2024

Excited to introduce MN-Minitron-8B-Instruct 📗! We've developed an even more powerful instruct model than its parent, Mistral-NeMo-12B, with significant improvements over LLaMa3.1-8B-Instruct as well! Weights on HF: huggingface.co/nvidia/Mistra… Demo: build.nvidia.com/nvidia/mist… Our new model outperforms LLaMa3.1-8B-Instruct on key benchmarks, including: 🧮 Math reasoning 🔧 Function calling 🧑‍🏫 Instruction following Additionally, our model improves 7 out of 8 metrics of the parent 12B. This model is a result of combining pruning and distillation, reducing the original Mistral-NeMo-12B-Base model to an efficient 8B, followed by alignment with NeMo Aligner. Thanks to the community for support that encourages us to release more models! 💡Useful links: NeMo-Aligner: github.com/NVIDIA/NeMo-Align… Minitron paper: arxiv.org/abs/2408.11796

957

Oleksii Kuchaiev · Aug 23, 2025 · 9:40 AM UTC

Oleksii Kuchaiev

@kuchaev

23 Aug 2025

Replying to @arthurmensch

Very impressive!

407

Oleksii Kuchaiev · Oct 18, 2024 · 11:09 PM UTC

Oleksii Kuchaiev

@kuchaev

18 Oct 2024

this is an excellent work from NVIDIA to speed up convergence of transformers with algorithmic modifications

Tanishq Mathew Abraham, Ph.D.

@iScienceLuvr

10 Oct 2024

Normalized Transformer - tricks to keep the activations constrained, improves training convergence; from NVIDIA Was pointed to this paper by lucidrains arxiv.org/abs/2410.01131

1,017

Oleksii Kuchaiev · Nov 12, 2025 · 9:06 PM UTC

Oleksii Kuchaiev

@kuchaev

12 Nov 2025

If someone approaches you to talk about "agents", always ask them for their definition of what it is. Often a good signal on whether to continue or avoid the conversation.

718

Oleksii Kuchaiev · Oct 30, 2025 · 4:19 PM UTC

Oleksii Kuchaiev

@kuchaev

30 Oct 2025

New Reward modeling research and models from our team! 1. "RLBFF: Binary Flexible Feedback to bridge between Human Feedback & Verifiable Rewards" 2. "Think Twice: Branch-and-Rethink Reasoning Reward Model" As usual, models are on @huggingface Hub. Links in the reply.

978

Oleksii Kuchaiev · Dec 16, 2022 · 6:08 PM UTC

Oleksii Kuchaiev

@kuchaev

16 Dec 2022

This holiday season please consider supporting Ukraine which is being attacked by the terrorist federation. Today 76 rockets were fired at critical civilian infrastructure with typical terrorist intents of generating fear and humanitarian disaster.

1,399

Oleksii Kuchaiev · Oct 8, 2024 · 1:09 PM UTC

Oleksii Kuchaiev

@kuchaev

8 Oct 2024

if you are at #COLM2024, check out our work - NeMo Aligner. Poster #3 during morning session today.

987

Oleksii Kuchaiev · Jun 4, 2021 · 6:58 PM UTC

Oleksii Kuchaiev

@kuchaev

4 Jun 2021

NeMo 1.0.0 is ready! developer.nvidia.com/blog/ac… #asr #NLProc #DeepLearning #AI

Accelerating Conversational AI Research with New Cutting-Edge Neural Networks and Features from...

The 1.0 update brings significant architectural, code quality, and documentation improvements as well as a plethora of new state-of-the-art neural networks and pretrained checkpoints in several…

developer.nvidia.com

Oleksii Kuchaiev · Jan 20, 2025 · 6:44 PM UTC

Oleksii Kuchaiev

@kuchaev

20 Jan 2025

Replying to @natolambert

Not having value function is a huge advantage of GRPO and REINFORCE in the context of LLMs. This is because value function (critique model) is supposed to assign values to *partial* generations which is fundamentally hard.

1,016
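
A short sketch of why no value function is needed: in a GRPO-style update, several completions are sampled for the same prompt, each finished completion gets one scalar reward, and the advantage is that reward relative to the rest of its group. The normalization below (population standard deviation plus a small epsilon) and the binary rewards are assumptions for illustration; implementations differ in these details.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-6):
    """Advantage of each completion relative to its own sampling group.

    Only complete generations are scored, so nothing has to estimate the
    value of a partial generation.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four completions sampled for one prompt, scored by a hypothetical verifier (1 = correct).
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

The group mean plays the role a learned critic would otherwise play as a baseline, and every token of a completion shares that completion's advantage.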

Oleksii Kuchaiev · Oct 6, 2024 · 10:36 PM UTC

Oleksii Kuchaiev

@kuchaev

6 Oct 2024

looking forward to interesting conference! DM me if you want to chat.

Conference on Language Modeling @COLM_conf

6 Oct 2024

Quiet before the storm

5,646

Oleksii Kuchaiev · May 28, 2025 · 10:36 PM UTC

Oleksii Kuchaiev

@kuchaev

28 May 2025

NVIDIA Blackwell: The Journey From Die to Data Center piped.video/1la6fMl7xNA?si=aAzk… via @YouTube

NVIDIA Blackwell: The Journey From Die to Data Center

Embark on a journey to explore the building of NVIDIA Blackwell, st...

youtube.com

1,607

Oleksii Kuchaiev · Aug 18, 2025 · 6:49 PM UTC

Oleksii Kuchaiev

@kuchaev

18 Aug 2025

Tech report link research.nvidia.com/labs/adl…

907

Oleksii Kuchaiev · Sep 27, 2024 · 9:15 PM UTC

Oleksii Kuchaiev

@kuchaev

27 Sep 2024

Replying to @hamishivi @yizhongwyz @liujc1998 @zeqiuwu1 @valentina__py @natolambert @YejinChoinka @HannaHajishirzi

This is an excellent paper, which we found very useful for those working on alignment / post training. Congrats on NeurIPS acceptance!

437

Oleksii Kuchaiev · Aug 8, 2025 · 5:45 PM UTC

Oleksii Kuchaiev

@kuchaev

8 Aug 2025

Replying to @natolambert

we have some base models here: huggingface.co/collections/n…

Nemotron-H - a nvidia Collection

Mamba-Transformer hybrid models

huggingface.co

771

Oleksii Kuchaiev · Sep 5, 2024 · 6:45 PM UTC

Oleksii Kuchaiev

@kuchaev

5 Sep 2024

Replying to @janleike

disagree. @GavinNewsom should veto this bill as it focuses on regulating model development rather than their applications.

340

Oleksii Kuchaiev · Apr 21, 2025 · 10:42 PM UTC

Oleksii Kuchaiev

@kuchaev

21 Apr 2025

I’ll be in Singapore attending ICLR2025. Looking forward to chatting in person about model post-training, alignment and reasoning! ✈️🇸🇬

501

Oleksii Kuchaiev · Aug 18, 2025 · 7:38 PM UTC

Oleksii Kuchaiev

@kuchaev

18 Aug 2025

Replying to @Teknium @teknium

This is a "runtime" feature. We started with same approach as Qwen3 but noticed that the model starts "thinking" outside of the thinking trace if forced to answer. Training on truncated thinking traces fixed that. Section 3.4 research.nvidia.com/labs/adl…

1,522

Oleksii Kuchaiev · Dec 16, 2020 · 10:06 PM UTC

Oleksii Kuchaiev

@kuchaev

16 Dec 2020

Thank you, Mozilla Common Voice! (New ASR models in NeMo are coming ;) ) —— 2020 End-of-Year Common Voice Dataset Release - Common Voice - Mozilla Discourse discourse.mozilla.org/t/2020…

2020 End-of-Year Common Voice Dataset Release

Happy end of 2020! While it has been a tumultuous year for all, the Common Voice team is excited to announce the end of year data set release! Firstly, we could not have made it through this year...

discourse.mozilla.org

Oleksii Kuchaiev · Aug 18, 2025 · 6:36 PM UTC

Oleksii Kuchaiev

@kuchaev

18 Aug 2025

This is how runtime reasoning-budget control works for the 9B model. You can limit the "thinking" token budget, forcing the model to produce an answer. We also made the model not think outside of the thinking trace in such cases.

1,218
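The budget mechanism described in this tweet can be illustrated with a minimal decoding sketch. This is not the actual implementation: the `<think>` / `</think>` / `<eos>` token names, the `generate_with_budget` helper, and the `toy_model` stand-in are all assumptions made for illustration. The idea shown is only the runtime part: cap the number of thinking tokens, force the end-of-thinking token when the cap is hit, then decode the answer.

```python
from typing import Callable, List

# Assumed special tokens; the tweet does not name the actual delimiters.
THINK_START, THINK_END, EOS = "<think>", "</think>", "<eos>"


def generate_with_budget(
    next_token: Callable[[List[str]], str],
    prompt: List[str],
    thinking_budget: int,
    max_answer_tokens: int = 64,
) -> List[str]:
    """Decode with a hard cap on thinking tokens, then decode the answer."""
    tokens = list(prompt) + [THINK_START]
    used = 0
    # Phase 1: let the model think until it closes the trace or the budget runs out.
    while used < thinking_budget:
        tok = next_token(tokens)
        tokens.append(tok)
        if tok == THINK_END:
            break
        used += 1
    else:
        # Budget exhausted without a natural close: force the end-of-thinking token.
        tokens.append(THINK_END)
    # Phase 2: decode the answer after the (possibly forced) close.
    for _ in range(max_answer_tokens):
        tok = next_token(tokens)
        if tok == EOS:
            break
        tokens.append(tok)
    return tokens


def toy_model(context: List[str]) -> str:
    """Hypothetical stand-in for a model: keeps thinking until the trace is closed."""
    if THINK_END not in context:
        return "step"
    return "answer" if context[-1] == THINK_END else EOS


out = generate_with_budget(toy_model, ["question"], thinking_budget=3)
print(out)
```

The sketch covers only the decoding side. Getting the model to stop reasoning once the trace has been force-closed is a training matter, which this code does not address.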

Oleksii Kuchaiev · Aug 7, 2024 · 9:32 PM UTC

Oleksii Kuchaiev

@kuchaev

7 Aug 2024

Replying to @karpathy

It seems like the recipe for achieving superhuman performance in domain X is now well known: have a perfect reward model in X (e.g. game rules, physics) + a good transformer-based heuristic + graph reasoning (MCTS/A*, etc.). A perfect reward model for LLMs would indeed be a game changer.

477

Oleksii Kuchaiev · Mar 25, 2025 · 7:20 PM UTC

Oleksii Kuchaiev

@kuchaev

25 Mar 2025

Replying to @JeffDean @lmarena_ai

very impressive model! congratulations.

239

Oleksii Kuchaiev · Apr 19, 2025 · 4:56 AM UTC

Oleksii Kuchaiev

@kuchaev

19 Apr 2025

Replying to @garrytan

I wish more people would move to AIME2025 in 2025.

1,251

Oleksii Kuchaiev · Jan 20, 2025 · 9:00 PM UTC

Oleksii Kuchaiev

@kuchaev

20 Jan 2025

Replying to @natolambert

I wonder what your take is on this.

1,031

Oleksii Kuchaiev · Apr 24, 2025 · 12:59 PM UTC

Oleksii Kuchaiev

@kuchaev

24 Apr 2025

If you are at #ICLR25 stop by our poster #239 04/25 at poster session 4 (3:00pm-5:30pm): "HelpSteer2-Preference: Complementing Ratings with Preferences". Will be happy to chat about data collection and RLHF. P.S. HelpSteer3 is already available: huggingface.co/datasets/nvid…

458

Oleksii Kuchaiev · Jan 4, 2022 · 10:37 PM UTC

Oleksii Kuchaiev

@kuchaev

4 Jan 2022

We have an exciting new job opportunity for an NLP researcher on our team. Please check the job description and apply here if interested: nvidia.wd5.myworkdayjobs.com… #NLProc #NLP #OpenSource

Work With us and Transform Industries

Learn about our culture and much more. #NVIDIA Careers.

nvidia.com

Oleksii Kuchaiev · Sep 11, 2025 · 7:26 PM UTC

Oleksii Kuchaiev

@kuchaev

11 Sep 2025

OSS 💪

Interconnects

@interconnectsai

11 Sep 2025

Latest open artifacts (#14): NVIDIA's rise, "Swiss & UAE DeepSeek," and a resurgence of open data

While Qwen takes some rest, others continue to fuel the open model space. interconnects.ai/p/latest-op…

43 of the best models/datasets from 27 different organizations. Featuring:

NVIDIA (@nvidia) x6
Swiss National Supercomputing Centre (@cscsch)
Ant Group (@AntGroup) x2
Hugging Face (@huggingface) x2
ByteDance (@BytedanceTalk) x2
DeepSeek (@deepseek_ai)
Meituan (@Meituan_LongCat)
Moonshot AI (@Kimi_Moonshot)
Baidu (@Baidu_Inc)
Cohere (@Cohere_Labs) x2
OpenBMB (@OpenBMB) x2
Tilde (@tilderesearch)
Liquid AI (@liquidai)
Meta (@Meta)
Alibaba AIDC (@AI_AlibabaInt)
Baichuan AI (@BaichuanAI)
Allen AI (@allen_ai) x2
Tencent (@TencentGlobal) x3
Microsoft (@Microsoft) x2
LLM360 (@llm360)
Jan (@jandotai)
Google (@Google) x2
IBM (@IBM) x2
JHU CLSP (@jhuclsp)
Qwen (@Alibaba_Qwen)
Motif Technologies
Skywork (@Skywork_ai)

1,434

Oleksii Kuchaiev · Jan 17, 2024 · 8:58 PM UTC

Oleksii Kuchaiev

@kuchaev

17 Jan 2024

Our team is hiring Sr. Applied Scientists to work on AI model alignment and customization (text and multimodal). If you have a strong track record and experience with LLMs, RL, or multimodal models, please apply. Can be in-person or remote. #NLP #hiring nvidia.wd5.myworkdayjobs.com…

735

Oleksii Kuchaiev · Nov 20, 2023 · 6:46 PM UTC

Oleksii Kuchaiev

@kuchaev

20 Nov 2023

Our team has studied the tradeoffs between performance and the number of trainable params in LoRA. This work would be especially useful to those building and scaling AI customization services. Great work by @rendu_a and Tugrul Konuk arxiv.org/abs/2311.09578

352
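For context on what "number of trainable params" means here, the following is a small arithmetic sketch of parameter counts for standard LoRA. It is not the method from the linked paper. The layer size of 4096 and the helper names are assumptions chosen only for illustration. For a dense weight of shape (d_out, d_in), a rank-r LoRA update trains two factors with r * (d_in + d_out) parameters in total.

```python
def full_params(d_in: int, d_out: int) -> int:
    """Parameters in a dense weight matrix W of shape (d_out, d_in)."""
    return d_in * d_out


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters in a standard LoRA update B @ A,
    with B of shape (d_out, rank) and A of shape (rank, d_in)."""
    return rank * (d_in + d_out)


# Hypothetical layer size, chosen only for illustration.
d = 4096
for rank in (1, 8, 64):
    count = lora_params(d, d, rank)
    share = count / full_params(d, d)
    print(f"rank={rank:>2}: {count:>7} trainable params ({share:.2%} of the full matrix)")
```

Raising the rank grows the trainable parameter count linearly, which is the knob being traded off against task performance.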

Oleksii Kuchaiev · Dec 13, 2024 · 10:56 PM UTC

Oleksii Kuchaiev

@kuchaev

13 Dec 2024

Replying to @natolambert

if one has a strong enough synthetic data pipeline, do they even need pre-training on Internet tokens… 🤔

3,075

Oleksii Kuchaiev · Apr 28, 2023 · 10:23 PM UTC

Oleksii Kuchaiev

@kuchaev

28 Apr 2023

Replying to @williamfalcon

You missed some of the open-source, commercially friendly (CC-BY-4) models built using lightning :) huggingface.co/models?librar…

Models compatible with the NeMo library – Hugging Face

Explore machine learning models.

huggingface.co

687

Oleksii Kuchaiev · May 7, 2024 · 7:01 PM UTC

Oleksii Kuchaiev

@kuchaev

7 May 2024

The latest release of NeMo-Aligner adds TRT-LLM integration, which speeds up RLHF rollouts by up to 7x compared to a pure PyTorch implementation github.com/NVIDIA/NeMo-Align…

407

Oleksii Kuchaiev · Jan 5, 2022 · 8:29 PM UTC

Oleksii Kuchaiev

@kuchaev

5 Jan 2022

Replying to @sergeykarayev

We do github.com/NVIDIA/NeMo/tree/… (much more to come in the next months)

Oleksii Kuchaiev · Nov 19, 2023 · 7:57 PM UTC

Oleksii Kuchaiev

@kuchaev

19 Nov 2023

Replying to @alexgraveley

Yes, it is currently very limited on human preference data. You might want to add the new dataset we published this week, huggingface.co/datasets/nvid…, which can be used with DPO. Btw, in most of our experiments SFT < DPO < SteerLM <= PPO. So while simple, DPO lags behind PPO and SteerLM.

1,294

Oleksii Kuchaiev · Dec 28, 2023 · 12:56 PM UTC

Oleksii Kuchaiev

@kuchaev

28 Dec 2023

I was taught that trees aren't supposed to have any cycles!

300