elvis · Jun 27, 2026 · 2:59 PM UTC

Pinned Tweet

elvis

@omarsar0

15h

x.com/i/article/206982584772…

Building Agents with Vercel's Eve Framework

Vercel recently shipped Eve, an open-source framework for building, running, and scaling agents. The core idea is that you stop hand-rolling the same agent plumbing every time, and start treating an

108

39,138

elvis · Feb 18, 2025 · 4:23 AM UTC

elvis

@omarsar0

18 Feb 2025

BREAKING: xAI announces Grok 3 Here is everything you need to know:

351

7,386

1,898,358

elvis · Nov 11, 2024 · 3:07 PM UTC

elvis

@omarsar0

11 Nov 2024

We live in incredible times.

964

6,890

411,669

elvis · May 17, 2025 · 7:07 PM UTC

elvis

@omarsar0

17 May 2025

AI Agents vs. Agentic AI Interesting paper summarizing distinctions between AI Agents and Agentic AI. It also talks about the key ideas, solutions, and the future. Here are my notes:

225

1,039

5,569

753,154

elvis · Jul 10, 2025 · 4:15 AM UTC

elvis

@omarsar0

10 Jul 2025

BREAKING: xAI announces Grok 4 "It can reason at a superhuman level!" Here is everything you need to know:

111

352

5,624

1,323,999

elvis · May 25, 2023 · 1:57 PM UTC

elvis

@omarsar0

25 May 2023

This maths book is trending on Hacker News! I took a quick look and realized how great of a book this is to learn how to think mathematically. It's 700 pages long and very approachable compared to other maths books. math.cmu.edu/~jmackey/151_12…

990

4,977

736,944

elvis · Dec 19, 2021 · 12:29 PM UTC

elvis

@omarsar0

19 Dec 2021

Just being honest: when looking for skilled machine learning and NLP engineers, I'm not looking at CVs anymore. Now I directly look at blogs, GitHub repos, videos, Twitter, etc. Having a CV is fine... but don't forget to document along the way (in detail) what you've built.

119

370

4,674

elvis · Jan 17, 2025 · 4:02 PM UTC

elvis

@omarsar0

17 Jan 2025

Foundations of LLMs This amazing new LLM book just dropped on arXiv. 200+ pages! It covers areas such as pre-training, prompting, and alignment methods. It looks like a great intro to LLMs for devs and researchers.

807

4,699

401,654

elvis · Jun 14, 2025 · 5:36 PM UTC

elvis

@omarsar0

14 Jun 2025

Anthropic is killing it with these technical posts. If you're an AI dev, stop what you are doing and go read this. It shows, in great detail, how to implement an effective multi-agent research system. Pay attention to these key parts:

433

4,522

563,112

elvis · Jun 7, 2025 · 12:54 PM UTC

elvis

@omarsar0

7 Jun 2025

The Illusion of Thinking in LLMs Apple researchers discuss the strengths and limitations of reasoning models. Apparently, reasoning models "collapse" beyond certain task complexities. Lots of important insights on this one. (bookmark it!) Here are my notes:

136

612

4,442

955,775

elvis · May 14, 2025 · 8:47 PM UTC

elvis

@omarsar0

14 May 2025

LLMs Get Lost in Multi-turn Conversation The cat is out of the bag. Pay attention, devs. This is one of the most common issues when building with LLMs today. Glad there is now a paper to share insights. Here are my notes:

611

4,139

754,732

elvis · Sep 30, 2025 · 7:02 PM UTC

elvis

@omarsar0

30 Sep 2025

As usual, Anthropic just published another banger. This one is on context engineering. Great section on how it is different from prompt engineering. A must-read for AI devs.

440

3,959

367,758

elvis · Mar 31, 2023 · 1:00 PM UTC

elvis

@omarsar0

31 Mar 2023

BloombergGPT is a new LLM for finance. It's a 50 billion parameter language model trained on financial data. Claims the largest domain-specific dataset yet with 363 billion tokens... further augmented with 345 billion tokens from general purpose datasets. Outperforms existing models on financial tasks while not sacrificing performance on general LLM benchmarks. arxiv.org/abs/2303.17564v1

673

3,493

1,012,446

elvis · Mar 11, 2025 · 6:40 PM UTC

elvis

@omarsar0

11 Mar 2025

NEW: OpenAI announces new tools for building agents. Here is everything you need to know:

280

3,321

781,294

elvis · May 30, 2025 · 9:20 PM UTC

elvis

@omarsar0

30 May 2025

YC on the key prompting techniques used by the best AI startups:

300

3,297

658,759

elvis · Feb 13, 2022 · 6:02 PM UTC

elvis

@omarsar0

13 Feb 2022

The past month I've been writing detailed notes for the first 15 lectures of Stanford's NLP with Deep Learning. Notes contain code, equations, practical tips, references, etc. As I tidy the notes, I need to figure out how to best publish them. Here are the topics covered so far:

556

3,113

elvis · Oct 6, 2023 · 1:21 PM UTC

elvis

@omarsar0

6 Oct 2023

How Transformers Work This is probably one of the most beautiful visualizations of how today's LLMs work. ig.ft.com/generative-ai/

721

3,132

743,601

elvis · Sep 6, 2025 · 9:39 PM UTC

elvis

@omarsar0

6 Sep 2025

Everyone is talking about this new OpenAI paper. It's about why LLMs hallucinate. You might want to bookmark this one. Let's break down the technical details:

106

456

3,196

454,529

elvis · Dec 25, 2022 · 6:25 PM UTC

elvis

@omarsar0

25 Dec 2022

2022: A Year in Review (ML Papers Edition) In this thread, let's take a look at some of the top trending ML papers of 2022 ↓

626

3,089

704,925

elvis · Apr 5, 2025 · 7:16 PM UTC

elvis

@omarsar0

5 Apr 2025

Llama 4 is here!
- Llama 4 Scout & Maverick are up for download
- Llama 4 Behemoth (preview)
- Advanced problem solving & multilingual
- Supports long context up to 10M tokens
- Great for multimodal apps & agents
- Image grounding
- Top performance at the lowest cost
- Can be served within $0.19-$0.49/M tokens

344

3,154

407,077

elvis · Mar 3, 2025 · 2:44 PM UTC

elvis

@omarsar0

3 Mar 2025

A Deep Dive into Reasoning LLMs This is a really nice summary of the progress made in post-training and reasoning LLMs. Highly recommend this one!

558

2,998

245,047

elvis · Apr 16, 2025 · 1:54 PM UTC

elvis

@omarsar0

16 Apr 2025

Exactly how I like learning about this stuff. MCMC is not a difficult concept to understand if you have the right person explain it to you.

256

2,929

258,849

elvis · Nov 1, 2022 · 1:12 PM UTC

elvis

@omarsar0

1 Nov 2022

All-in-one book that covers most of the maths you will need for machine learning. Free PDF here: cis.upenn.edu/~jean/math-dee…

580

2,760

elvis · Mar 28, 2023 · 11:28 PM UTC

elvis

@omarsar0

28 Mar 2023

New language model just dropped! GPT4All - a 7B parameter model (based on LLaMA) trained on a massive collection of clean assistant data including code, stories, and dialogue. Also releases 800K data samples, data curation procedures, training code, and model weights to promote open research. A quantized 4-bit version of the model is released that can run on CPU. repo: github.com/nomic-ai/gpt4all

502

2,802

719,580

elvis · Apr 23, 2025 · 1:52 PM UTC

elvis

@omarsar0

23 Apr 2025

Here is a new open-source IDE to help you build multi-agent systems. It's like Cursor but specifically for building multi-agent workflows. It's powered by OpenAI Agents SDK, connects MCP servers, and can integrate into your apps using HTTP or the SDK.

182

506

72,239

elvis · Apr 9, 2025 · 2:30 PM UTC

elvis

@omarsar0

9 Apr 2025

NEW: Google announces Agent2Agent Agent2Agent (A2A) is a new open protocol that lets AI agents securely collaborate across ecosystems regardless of framework or vendor. Here is all you need to know:

471

2,801

337,114

elvis · Apr 17, 2022 · 2:40 PM UTC

elvis

@omarsar0

17 Apr 2022

ML YouTube Courses (5.1K⭐️) I've added more info and categories to the repo, so it's much easier to find relevant courses. github.com/dair-ai/ML-YouTub…

647

2,689

elvis · Jun 6, 2025 · 1:47 PM UTC

elvis

@omarsar0

6 Jun 2025

Top 50 LLM Interview Questions. Looks like a great resource to learn LLM basics:

371

2,666

349,927

elvis · May 7, 2022 · 3:49 PM UTC

elvis

@omarsar0

7 May 2022

This repo contains all my notes for the "Introduction to Deep Learning" course from MIT. The notes are great for studying fundamentals and new topics in ML. I put them in Notion so you can easily extend them. More exciting ML content is on its way! github.com/dair-ai/ML-Course…

699

2,607

elvis · Oct 17, 2022 · 2:06 PM UTC

elvis

@omarsar0

17 Oct 2022

These machine learning cheatsheets contain some of the best and most well-organized ML content I've come across. Sometimes, it's just good to understand a concept and its context at a high level before going deep. This resource helps with that. stanford.edu/~shervine/teach…

550

2,596

elvis · Jul 15, 2022 · 2:26 PM UTC

elvis

@omarsar0

15 Jul 2022

Machine Learning YouTube Courses ( 7K⭐️ ) A few new entries have made it into the collection. I'm glad to hear that a lot of ML students found this repo helpful to discover new courses. github.com/dair-ai/ML-YouTub…

635

2,561

elvis · Sep 6, 2024 · 7:49 PM UTC

elvis

@omarsar0

6 Sep 2024

Anthropic's recent AI Prompt Engineering deep dive is a must watch! Here's Claude's summary of the mentioned prompting techniques and tips:

338

2,656

394,906

elvis · Jun 12, 2023 · 12:59 AM UTC

elvis

@omarsar0

12 Jun 2023

FinGPT: Open-Source Financial LLMs FinGPT is an open-source LLM for the finance sector. It takes a data-centric approach, providing researchers & practitioners with accessible resources to develop FinLLMs. paper: arxiv.org/abs/2306.06031 code: github.com/AI4Finance-Founda…

570

2,594

572,227

elvis · Feb 25, 2022 · 2:59 PM UTC

elvis

@omarsar0

25 Feb 2022

I've been writing a special notebook to help get comfortable with mathematics for machine learning. It will contain popular equations paired with code and explanations. The idea is to start with a small collection of 100 (beginner to advanced). Would you be interested in this?

148

239

2,565

elvis · Apr 9, 2025 · 4:18 PM UTC

elvis

@omarsar0

9 Apr 2025

NEW: Google presents Agent Development Kit (ADK)
Features:
- code-first
- multi-agents
- rich tool ecosystem
- flexible orchestration
- integrated dev xp
- development-ready
- streaming
- state, memory, artifacts
- extensibility
> pip install google-adk

415

2,628

271,860

elvis · Apr 4, 2023 · 5:08 PM UTC

elvis

@omarsar0

4 Apr 2023

Prompt Engineering Guide (20K⭐️) We started with basic prompt examples and have expanded to a comprehensive prompt engineering guide used by thousands of AI developers and researchers working with LLMs.
- Now 200K+ learners
- Chinese & Japanese translations are now available
- GPT-4 & ChatGPT guides and notebooks
- Collection of all the latest tools and papers on prompt engineering
- Added LLM collection
- Paper explanations in progress
... and much more
We aim to build the ultimate resource to learn how to work and build with LLMs. A lot more to come. Support and contributions are welcome! web: promptingguide.ai/ repo: github.com/dair-ai/Prompt-En…

510

2,551

617,305

elvis · Apr 4, 2023 · 12:03 AM UTC

elvis

@omarsar0

4 Apr 2023

New chatbot just dropped! Vicuna-13B - an open-source chatbot trained by fine-tuning LLaMA on ~70K user-shared ChatGPT conversations. Claims to achieve "more than 90%* quality of OpenAI ChatGPT and Google Bard while outperforming other models like LLaMA and Stanford Alpaca in more than 90%* of cases". Seems possible to run it on your own machine with a single GPU. repo: github.com/lm-sys/FastChat blog: vicuna.lmsys.org/ demo: chat.lmsys.org/

446

2,504

634,286

elvis · Jan 24, 2023 · 2:08 PM UTC

elvis

@omarsar0

24 Jan 2023

Understanding Deep Learning If you are looking for a comprehensive deep learning book with some of the more recent trends ( transformers, diffusion models, GNNs,...), this looks like a great option. udlbook.github.io/udlbook/

581

2,472

283,228

elvis · Mar 1, 2021 · 7:12 PM UTC

elvis

@omarsar0

1 Mar 2021

deep learning activation functions made cool and cute teddit.net/r/learnmachinelea…

547

2,441

elvis · Sep 30, 2025 · 10:21 PM UTC

elvis

@omarsar0

30 Sep 2025

We are living in the most insane timeline. I just asked Claude Code (with Claude Sonnet 4.5) to develop an MCP Server (end-to-end) that allows me to programmatically create n8n workflows from within Claude Code itself. Took about 10 mins!

206

2,526

221,888

elvis · Mar 30, 2023 · 2:09 PM UTC

elvis

@omarsar0

30 Mar 2023

As an ML Engineer, this is one of the most useful applications of GPT-4 I've seen. Chat Explore is a powerful AI-powered data exploration tool. Here’s why I am so impressed:

341

2,418

716,668

elvis · Feb 27, 2025 · 3:35 PM UTC

elvis

@omarsar0

27 Feb 2025

Say goodbye to Chain-of-Thought. Say hello to Chain-of-Draft. To address the issue of latency in reasoning LLMs, this work introduces Chain-of-Draft (CoD). Read on for more:

365

2,435

278,486

elvis · Jan 31, 2025 · 10:44 PM UTC

elvis

@omarsar0

31 Jan 2025

o3-mini-high (left) vs. deepseek-r1 (right) results from the first try deepseek-r1 is cracked... wtf!

102

170

2,368

719,806

elvis · Nov 9, 2021 · 12:25 PM UTC

elvis

@omarsar0

9 Nov 2021

🎓 ML YouTube Courses 🎓 In case you missed it, I maintain a highly-curated collection of some of the best and latest machine learning courses available on YouTube. So much good free content to get started with or to catch up on. github.com/dair-ai/ML-YouTub…

597

2,353

elvis · Aug 21, 2025 · 9:26 PM UTC

elvis

@omarsar0

21 Aug 2025

Anthropic continues to crush it with these guides. This is a good example of what context engineering involves.

170

2,436

404,036

elvis · Apr 27, 2025 · 5:18 PM UTC

elvis

@omarsar0

27 Apr 2025

265 pages of everything you need to know about building AI agents. 5 things that stood out to me about this report:

414

2,362

281,147

elvis · Dec 27, 2024 · 6:23 PM UTC

elvis

@omarsar0

27 Dec 2024

Nice list by Google! It consists of 321 real-world gen AI use cases from the world's leading organizations. Great for learning how others are finding success with gen AI and AI agents.

274

2,279

200,256

elvis · Jan 23, 2025 · 6:18 PM UTC

elvis

@omarsar0

23 Jan 2025

OpenAI Introduces Operator & Agents! Here is everything you need to know:

209

2,241

488,986

elvis · Mar 15, 2023 · 1:33 AM UTC

elvis

@omarsar0

15 Mar 2023

Lots of tweets about GPT-4 in the last 8 hours. Here is a thread highlighting some of the interesting examples, tricks, and discussions I've come across ↓

417

2,205

654,344

elvis · Mar 27, 2022 · 2:02 PM UTC

elvis

@omarsar0

27 Mar 2022

I'm creating a new repo organizing all my machine learning & NLP PyTorch notebooks. Out of curiosity, would you be interested in this?

291

2,145

elvis · Feb 28, 2023 · 2:21 AM UTC

elvis

@omarsar0

28 Feb 2023

Here we go! Microsoft introduces a multimodal large language model called Kosmos-1. Achieves great performance on language understanding, OCR-free NLP, perception-language tasks, visual QA, and more.

463

2,054

410,298

elvis · Jul 11, 2022 · 2:15 PM UTC

elvis

@omarsar0

11 Jul 2022

Wow! This is exciting! First ever course on Transformers by Stanford. Really looking forward to the release of all lectures. website: web.stanford.edu/class/cs25/ youtube: piped.video/playlist?list=PL…

384

2,056

elvis · Jan 6, 2025 · 7:35 PM UTC

elvis

@omarsar0

6 Jan 2025

Google recently published this great whitepaper on Agents. 2025 is going to be a huge year for AI Agents. Here's what's included:
- Introduction to AI Agents
- The role of tools in Agents
- Enhancing model performance with targeted learning
- Quick start to Agents with LangChain
- Production applications with Vertex AI Agents
Great place to start learning about AI Agents.

363

2,097

261,635

elvis · Jan 22, 2024 · 3:36 PM UTC

elvis

@omarsar0

22 Jan 2024

🎓LLM Course This is such a beautiful and comprehensive resource on LLMs. It includes notebooks, key references, and roadmaps. There is something to learn for everyone. For students, researchers, and practitioners. The Prompt Engineering Guide is also referenced, which is cool to see. One observation as I was reviewing the references is how much hard work the ML community dedicates toward open and high-quality education. This resource does a great job of organizing all those incredible LLM educational resources that exist out there. One topic I would add is LLMOps. But to be fair, the majority of the topics are roughly covered in the LLM Engineer Roadmap. Highly recommended! And last but not least, many thanks to @maximelabonne for releasing this excellent resource. 👏

456

2,034

173,323

elvis · Feb 19, 2025 · 2:43 PM UTC

elvis

@omarsar0

19 Feb 2025

NEW: Google introduces AI co-scientist. It's a multi-agent AI system built with Gemini 2.0 to help accelerate scientific breakthroughs. 2025 is truly the year of multi-agents! Let's break it down:

109

375

2,055

211,574

elvis · Feb 27, 2023 · 2:19 PM UTC

elvis

@omarsar0

27 Feb 2023

ChatLLaMA - an open-source implementation of LLaMA based on RLHF. Claims a 15x faster training process than ChatGPT. It allows users to fine-tune personalized ChatLLaMA assistants. github.com/nebuly-ai/nebullv…

427

2,019

347,784

elvis · Jul 5, 2025 · 6:33 PM UTC

elvis

@omarsar0

5 Jul 2025

Context Engineering Guide I'm writing a detailed guide on context engineering for AI devs. v1 is out now! (bookmark it) I use a concrete deep research multi-agent example to show what context engineering involves.

294

2,033

287,730

elvis · Aug 2, 2025 · 9:07 PM UTC

elvis

@omarsar0

2 Aug 2025

Hierarchical Reasoning Model This is one of the most interesting ideas on reasoning I've read in the past couple of months. It uses a recurrent architecture for impressive hierarchical reasoning. Here are my notes:

278

2,070

258,762

elvis · May 23, 2025 · 1:04 PM UTC

elvis

@omarsar0

23 May 2025

Microsoft releases NLWeb NLWeb uses MCP to make it simple to interact with websites in a standardized way. Devs can now convert any website into an AI app. MCP is to NLWeb what HTTP is to HTML. This went largely unnoticed this week, but it looks like a big deal.

318

2,060

282,761

elvis · Jan 29, 2023 · 4:30 PM UTC

elvis

@omarsar0

29 Jan 2023

Machine Learning Notes I've been writing notes introducing some of the most important topics in AI today. This thread lists a few notes I've published so far:

420

1,956

405,279

elvis · Jan 7, 2025 · 8:03 PM UTC

elvis

@omarsar0

7 Jan 2025

Don't do RAG
Proposes cache-augmented generation (CAG) to eliminate retrieval latency and minimize retrieval errors.
What is CAG? CAG aims to leverage the capabilities of long-context LLMs by preloading the LLM with all relevant docs in advance and precomputing the key-value (KV) cache. The preloaded context helps the model provide contextually accurate answers without the need for additional retrieval at runtime.
When to apply CAG? It's a useful alternative to RAG for cases where the documents/knowledge for retrieval are of limited, manageable size.
My thoughts: As LLMs advance in capabilities, I suspect that what we know as RAG today could change significantly, either architecturally or in how it's optimized. CAG is one in a growing list of developments and new ideas that have emerged recently to address limitations like poor retrieval relevancy and latency. There could also be hybrid methods that combine preloading with selective retrieval. Don't sleep on long-context LLMs. They are here to stay.

294

1,969

169,899
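A rough sketch of the "when to apply CAG" rule from the post above: preload everything when the documents fit in the context window, otherwise fall back to retrieval. The whitespace token count, the `reserve` budget, and the names `choose_strategy` and `PreloadedContext` are assumptions made for illustration. The class only stands in for a precomputed KV cache by building the document prefix once; it does not touch a real model.

```python
from typing import List


def rough_token_count(text: str) -> int:
    # Crude whitespace proxy (assumption); a real tokenizer gives different counts.
    return len(text.split())


def choose_strategy(docs: List[str], context_window: int, reserve: int = 1000) -> str:
    """Pick CAG when all docs fit in the window, leaving `reserve` tokens for the answer."""
    total = sum(rough_token_count(d) for d in docs)
    return "CAG" if total <= context_window - reserve else "RAG"


class PreloadedContext:
    """Hypothetical stand-in for a precomputed KV cache: the prefix is built once."""

    def __init__(self, docs: List[str]) -> None:
        self.prefix = "\n\n".join(docs)

    def prompt_for(self, question: str) -> str:
        # Every question reuses the same preloaded prefix; no retrieval step runs.
        return f"{self.prefix}\n\nQuestion: {question}"


small_docs = ["refund policy text", "shipping policy text"]
large_docs = ["word " * 5000]

small_choice = choose_strategy(small_docs, context_window=2000)
large_choice = choose_strategy(large_docs, context_window=2000)
cache = PreloadedContext(small_docs)
prompt = cache.prompt_for("How long do refunds take?")
```

In a real system the prefix would be run through the model once and its KV cache reused across questions, which is where the latency saving described in the post would come from.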

elvis · Jan 1, 2021 · 2:57 PM UTC

elvis

@omarsar0

1 Jan 2021

📘 Probabilistic Machine Learning: An Introduction I have been looking for a book like this. Kevin Murphy published the 2021 edition of the Probabilistic Machine Learning e-textbook. Love the emphasis on probability and math. It includes code examples. probml.github.io/pml-book/bo…

461

1,907

elvis · Aug 9, 2024 · 7:05 PM UTC

elvis

@omarsar0

9 Aug 2024

Transformer Explainer Really cool interactive tool to learn about the inner workings of a Transformer model. Apparently, it runs a GPT-2 instance locally in the user's browser and allows you to experiment with your own inputs. This is a nice tool to learn more about the different components inside the Transformer and the transformations that occur. Tool: poloclub.github.io/transform…

459

1,921

121,920

elvis · Apr 15, 2023 · 8:41 PM UTC

elvis

@omarsar0

15 Apr 2023

OpenAssistant is officially released! OpenAssistant is an open-source chat model. The release includes models, datasets, and a chat interface. The dataset consists of a ~161K human-generated, human-annotated assistant-style conversation corpus, including 35 different languages and annotated with ~461K quality ratings. This dataset release is huge! There are different models available including LLaMA-based and Pythia-based ones. We have seen many chat models released in the past few weeks but this one is probably a lot more powerful in terms of conversational capabilities. Will be testing it out in the coming days. web: open-assistant.io/chat dataset: huggingface.co/datasets/Open… models: huggingface.co/OpenAssistant

409

1,885

436,918

elvis · Jan 14, 2023 · 2:21 PM UTC

elvis

@omarsar0

14 Jan 2023

🐙ML Papers Explained An awesome new project with explanations of key deep learning concepts. (by @RitvikRastogi19 on @dair_ai) github.com/dair-ai/ML-Papers…

463

1,826

167,017

elvis · Dec 21, 2024 · 12:21 PM UTC

elvis

@omarsar0

21 Dec 2024

The hype around o3 is out of control. It’s not AGI, it’s not the singularity, and you definitely don’t have to change your worldview. In fact, the public doesn’t even have access to the models, so how can anyone claim any of the above? I appreciate how the OpenAI researchers presented o3. I encourage folks to check out the original presentation on YouTube. Don’t fall for all the hype threads you see here on X. OpenAI made it clear that there are lots of things to improve on. It’s exciting, yes, but the headlines are misleading and benchmark results don’t really say much these days. Hoping these words balance your timeline a bit. Share if you think it helps.

145

176

1,809

227,229

elvis · Jul 29, 2023 · 8:26 PM UTC

elvis

@omarsar0

29 Jul 2023

From word vectors to Reinforcement Learning from Human Feedback... Stanford's "Natural Language Processing with Deep Learning" course is one of the most relevant and best AI/ML courses today. It's just amazing how much knowledge and content this course pushes out every year. It's hands down one of my go-to resources to catch up on everything to do with NLP every year. It contains notes, suggested readings, slides, tips, exercises, and so on. I remember watching the 2017 lectures and just falling in love with the course. I have studied the course material since then and it has tremendously helped me to keep up with things on NLP. It can feel like an advanced course for people that are just beginning but it's still an exceptional reference to keep up with research and topics. (links in the replies)

345

1,839

411,951

elvis · Nov 9, 2025 · 3:05 PM UTC

elvis

@omarsar0

9 Nov 2025

Kimi K2 Thinking is a bigger deal than I thought! I just ran a quick eval on a deep agent I built for customer support. It's on par with GPT-5; no other LLM has reached this level of agentic, orchestration, and reasoning capabilities. Huge for agentic and reasoning tasks.

183

1,872

229,726

elvis · Jul 1, 2025 · 1:23 PM UTC

elvis

@omarsar0

1 Jul 2025

Small Language Models are the Future of Agentic AI Lots to gain from building agentic systems with small language models. Capabilities are increasing rapidly! AI devs should be exploring SLMs. Here are my notes:

303

1,866

268,580

elvis · Nov 9, 2024 · 2:16 PM UTC

elvis

@omarsar0

9 Nov 2024

My AI usage these days:
- claude-3.5-sonnet for most creative and writing tasks
- gemini-1.5-pro for video-related tasks
- chatgpt for image analysis and web search
- gpt-4o-mini and gemini-flash for agentic stuff
- o1-mini for reasoning and knowledge-intensive tasks
- llama-3.1 for local LLM usage
- midjourney for image generation
- runway for video generation
- elevenlabs for speech-related stuff
I'm intentionally experimenting with various models. It's often a combination of these that leads to the best performance. Where things stand, I believe it's a bad idea to overcommit to one model series. How's your usage looking?

178

1,819

227,116

elvis · Jan 31, 2025 · 5:22 PM UTC

elvis

@omarsar0

31 Jan 2025

Stanford CS234: Reinforcement Learning These lectures look like a nice introduction to reinforcement learning (RL). After the impact of RL in recent models like DeepSeek-R1 and o1, it's worth learning about RL today.

262

1,774

122,984

elvis · Feb 6, 2022 · 10:51 AM UTC

elvis

@omarsar0

6 Feb 2022

Graph neural networks (GNNs) are rapidly advancing progress in ML for complex graph data applications. Let's have a look at some resources to help you learn and keep up-to-date with GNNs ↓

379

1,733

elvis · Aug 12, 2023 · 2:24 PM UTC

elvis

@omarsar0

12 Aug 2023

Just came across this on arXiv. An awesome book on graph theory. If you are in computer science, graph theory is one of the most useful topics to study. 422 pages. Publicly available here: arxiv.org/abs/2308.04512

366

1,720

185,782

elvis · Feb 1, 2022 · 1:36 PM UTC

elvis

@omarsar0

1 Feb 2022

Mathematics is worth every minute you spend learning it.

212

1,688

elvis · Nov 15, 2022 · 4:30 PM UTC

elvis

@omarsar0

15 Nov 2022

🎉 Proud and excited to announce Galactica - a large language model for science. We trained a 120B parameter language model on a massive scientific corpus that performs different tasks such as solving math problems and summarizing academic literature.

Papers with Code

@paperswithcode

15 Nov 2022

🪐 Introducing Galactica. A large language model for science. Can summarize academic literature, solve math problems, generate Wiki articles, write scientific code, annotate molecules and proteins, and more. Explore and get weights: galactica.org

310

1,697

elvis · Sep 8, 2023 · 8:45 PM UTC

elvis

@omarsar0

8 Sep 2023

LLMs as Optimizers
This is a really neat idea. This new paper from Google DeepMind proposes an approach where the optimization problem is described in natural language. An LLM is then instructed to iteratively generate new solutions based on the defined problem and previously found solutions.
It was first tested on linear regression and the traveling salesman problem. With simple prompting, LLMs match or surpass hand-designed heuristic algorithms. This shows good potential for using LLMs as optimizers.
The idea is then applied to prompt optimization, which aims to maximize task accuracy on different tasks like math word problem-solving. The first piece of the proposed meta-prompt takes in previously generated prompts along with their corresponding training accuracies. The second piece includes the optimization problem description with samples obtained from a training set representing the task. At each optimization step, the goal is to generate new prompts that increase test accuracy based on the trajectory of previously generated prompts.
The optimized prompts outperform human-designed prompts on GSM8K and Big-Bench Hard, sometimes by over 50%! For math word problem solving, one of the most effective instructions found begins with "Take a deep breath and work on this problem step-by-step". arxiv.org/abs/2309.03409

373

1,723

300,469
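The two-part meta-prompt and the optimization loop described in the post above can be sketched roughly as follows. This is not the paper's code: `build_meta_prompt`, `optimize`, and the toy `propose`/`score` functions are hypothetical names, the LLM call is replaced by a deterministic stub, and listing past prompts from lowest to highest score is a choice made for this sketch.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, float]]


def build_meta_prompt(history: History, exemplars: List[Tuple[str, str]], top_k: int = 3) -> str:
    """Part 1: past prompts with scores. Part 2: task description with sampled examples."""
    best = sorted(history, key=lambda item: item[1])[-top_k:]  # lowest to highest score
    lines = ["Previous instructions with their training accuracy:"]
    lines += [f"text: {text}\nscore: {score:.2f}" for text, score in best]
    lines.append("Task examples:")
    lines += [f"Q: {q}\nA: {a}" for q, a in exemplars]
    lines.append("Write a new instruction that achieves a higher score.")
    return "\n".join(lines)


def optimize(propose: Callable[[str], str], score: Callable[[str], float],
             seed: str, exemplars: List[Tuple[str, str]], steps: int = 5) -> Tuple[str, float]:
    history: History = [(seed, score(seed))]
    for _ in range(steps):
        candidate = propose(build_meta_prompt(history, exemplars))  # LLM call in practice
        history.append((candidate, score(candidate)))               # evaluate on a training set
    return max(history, key=lambda item: item[1])


# Toy stand-ins (assumptions): the score rewards longer instructions, and the
# "LLM" extends the best-scoring instruction it sees in the meta-prompt.
def toy_score(text: str) -> float:
    return min(len(text.split()) / 10, 1.0)


def toy_propose(meta_prompt: str) -> str:
    last = [line for line in meta_prompt.splitlines() if line.startswith("text: ")][-1]
    return last[len("text: "):] + " carefully"


EXEMPLARS = [("What is 2 + 3?", "5")]
best_text, best_score = optimize(toy_propose, toy_score, "Solve the problem", EXEMPLARS, steps=3)
```

Swapping `toy_propose` for a real model call and `toy_score` for accuracy on held-out training examples would give the shape of the loop the post describes.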

elvis · Jul 17, 2025 · 9:19 PM UTC

elvis

@omarsar0

17 Jul 2025

Agent Leaderboard v2 is here!
> GPT-4.1 leads
> Gemini-2.5-flash excels at tool selection
> Kimi K2 is the top open-source model
> Grok 4 falls short
> Reasoning models lag behind
> No single model dominates all domains
More below:

201

1,754

274,917

elvis · Jul 19, 2024 · 11:29 PM UTC

elvis

@omarsar0

19 Jul 2024

GPT-4o mini is 60% cheaper than GPT-3.5 Turbo. That's insane! The model is priced at $0.15 per million input tokens and $0.60 per million output tokens (roughly a 2,500-page book). For comparison, GPT-3.5-turbo-0301 was $2.00 per 1M tokens a little over a year ago. On blended pricing (80% input tokens and 20% output tokens), GPT-4o mini has reduced costs to $0.24 per 1M tokens. Based on a few tests, the model seems to be good at structuring information, long-context understanding, and function calling, and it has great vision capabilities. I have done an extensive overview along with some test cases of GPT-4o mini here: piped.video/FNa1-OKN3yU?si=GmLc… As stated in the announcement, the cost per token of GPT-4o mini has dropped by 99% since text-davinci-003. If that's a trend, where are we going to be in a couple of months?

262

33,500
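The blended-price figure in the post above can be checked with a short calculation. The prices and the 80/20 input/output split are taken from the post; the function name `blended_price` is just an illustrative choice.

```python
def blended_price(input_price: float, output_price: float, input_share: float = 0.8) -> float:
    """Weighted average price per 1M tokens for a given input/output token mix."""
    return input_share * input_price + (1.0 - input_share) * output_price


# Prices per 1M tokens, as quoted in the post.
mini_blended = blended_price(0.15, 0.60)

# Relative drop versus the $2.00 per 1M tokens quoted for GPT-3.5-turbo-0301.
drop_vs_legacy = 1.0 - mini_blended / 2.00

print(f"blended: ${mini_blended:.2f} per 1M tokens")
print(f"drop vs $2.00: {drop_vs_legacy:.0%}")
```

With these inputs the blend is 0.8 × $0.15 + 0.2 × $0.60 = $0.24 per 1M tokens, which matches the figure in the post.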

elvis · Nov 8, 2025 · 2:38 PM UTC

elvis

@omarsar0

8 Nov 2025

The most effective AI Agents are built on these core ideas. It's what powers Claude Code. It's referred to as the Claude Agent SDK Loop, which is an agent framework to build all kinds of AI agents. (bookmark it) The loop involves three steps:
- Gathering Context: Use subagents (parallelize them for task efficiency when possible), compact/maintain context, and leverage agentic/semantic search for retrieving relevant context for the AI agent. Hybrid search approaches work really well for domains like agentic coding.
- Taking Action: Leverage tools, prebuilt MCP servers, bash/scripts (Skills have made it a lot easier), and generate code to take action and retrieve important feedback/context for the AI agent. Turns out you can also enhance MCP and token usage through code execution and routing, similar to how LLM routing increases efficiency in AI Agents.
- Verifying Output: You can define rules to verify outputs, enable visual feedback (this becomes increasingly important in multimodal problems), and consider LLM-as-a-Judge to verify quality based on fuzzy rules. Some problems will require visual cues and other forms of input to perform well.
Don't overcomplicate the workflow (e.g., by using computer-using agents when a simple Skill with clever scripts will do). This is a clean, flexible, and solid framework for how to build and work with AI agents in all kinds of domains.

253

1,823

176,069
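The three steps in the post above (gather context, take action, verify output) can be sketched as a plain loop. This is not the Claude Agent SDK API: `AgentState`, `run_loop`, and the toy callables are hypothetical names used only to show the control flow, with simple functions standing in for subagents, tools, and verification rules.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class AgentState:
    task: str
    context: List[str] = field(default_factory=list)
    output: str = ""


def run_loop(state: AgentState,
             gather: Callable[[AgentState], List[str]],
             act: Callable[[AgentState], str],
             verify: Callable[[AgentState], Tuple[bool, str]],
             max_turns: int = 3) -> AgentState:
    for _ in range(max_turns):
        state.context.extend(gather(state))   # 1. gather context (search, subagents)
        state.output = act(state)             # 2. take action (tools, scripts, code)
        ok, feedback = verify(state)          # 3. verify output (rules, judge)
        if ok:
            return state
        state.context.append(f"feedback: {feedback}")  # failed checks feed the next turn
    return state


# Toy stand-ins (assumptions) so the loop can run without a model or tools.
def toy_gather(state: AgentState) -> List[str]:
    return [f"note {len(state.context)}"]


def toy_act(state: AgentState) -> str:
    return f"{state.task} using {len(state.context)} context items"


def toy_verify(state: AgentState) -> Tuple[bool, str]:
    if len(state.context) >= 3:
        return True, ""
    return False, "need more context"


final = run_loop(AgentState(task="Summarize the ticket"), toy_gather, toy_act, toy_verify)
```

Feeding the verifier's feedback back into the context is one simple way to connect the third step to the first; a real agent would likely be more selective about what it keeps.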

elvis · Apr 30, 2022 · 1:23 PM UTC

elvis

@omarsar0

30 Apr 2022

🎓 Mathematics for Deep Learning This reference contains some mathematical concepts to help build a better understanding of deep learning: d2l.ai/chapter_appendix-math…

396

1,647

elvis · Nov 1, 2023 · 3:10 PM UTC

elvis

@omarsar0

1 Nov 2023

🎓Generative AI for Beginners Another great effort by Microsoft on AI education. This one contains a series of lessons on generative AI, including an introduction to LLMs, prompt engineering fundamentals, building text generation/chat applications, and more. github.com/microsoft/generat…

362

1,675

284,205

elvis · Jun 5, 2023 · 6:21 PM UTC

elvis

@omarsar0

5 Jun 2023

Generative AI Learning Path This is a great new learning resource on Generative AI by Google! Accessible for FREE! cloudskillsboost.google/path…

398

1,660

238,027

elvis · Sep 30, 2021 · 5:09 PM UTC

elvis

@omarsar0

30 Sep 2021

🎓 The Art of Linear Algebra An impressive set of graphic notes on the popular book "Linear Algebra for Everyone". A great resource for what's an important subject in machine learning and computer science in general. repo: github.com/kenjihiranabe/The… pdf: github.com/kenjihiranabe/The…

367

1,655

elvis · Feb 24, 2023 · 5:28 PM UTC

elvis

@omarsar0

24 Feb 2023

JUST IN: Meta AI introduces LLaMA, a family of LLMs with up to 65B parameters. LLaMA relies only on publicly available data, and the 13B model outperforms GPT-3 on most benchmarks despite being 10x smaller.

312

1,620

259,687

elvis · Dec 23, 2021 · 3:45 PM UTC

elvis

@omarsar0

23 Dec 2021

YouTube is easily becoming one of the best free "universities" for all things machine learning, engineering, math, and science.

178

1,575

elvis · Dec 3, 2024 · 3:06 PM UTC

elvis

@omarsar0

3 Dec 2024

MegaParse is an open-source tool for parsing various types of documents for LLM ingestion. Supports text, PDF, PowerPoint, excel, csv, and Word documents. It can convert these into a format ideal for LLMs. It can parse content of different types such as tables, TOC, headers, footers, images, etc. I am also building a similar tool and I think the most important feature at the moment is the ability to customize the format of the transformed data as different LLMs prefer different formats.

213

1,633

120,673

elvis · Nov 22, 2024 · 7:07 PM UTC

elvis

@omarsar0

22 Nov 2024

How I leverage AI today:

- Claude Projects for summarization
- CrewAI agents for orchestrating research agents
- Flowise AI for private doc analysis (i.e., RAG)
- Midjourney for image generation
- NotebookLM for education-related tasks
- ChatGPT Search for discovery
- Grok for finding interesting AI papers
- Google AI Studio for video transcription
- Cursor for fast code prototyping
- v0 for design work
- Anthropic Console for prompt optimization and evaluation
- Claude Artifacts for artifact generation like flow charts
- ChatGPT-o1-preview/canvas for reviewing/refining writing

This is a subset of my stack. When I feel performance deteriorates for any one tool or model, I switch to other alternatives. Bad idea to overcommit to one AI tool or product. I am also constantly experimenting with different models.

My work varies between writing, research, coding, product, marketing, and business operations. These tools, and many others, are simplifying the way I do work. How about you?

190

1,620

211,878

elvis · Aug 13, 2023 · 4:50 PM UTC

elvis

@omarsar0

13 Aug 2023

🎓Stanford CS229: Machine Learning (Spring 2022) Really cool to see a new iteration of this course. It's a classic ML course from Stanford that has helped tons of students get started with machine learning. Covers foundational topics such as logistic regression, Naive Bayes, kernels, neural networks, bias-variance, regularization, k-means, expectation maximization, and more. YouTube Lectures: piped.video/playlist?list=PL…

348

1,573

190,869

elvis · Oct 9, 2023 · 2:28 PM UTC

elvis

@omarsar0

9 Oct 2023

Large Language Models (in 2023) An excellent summary of the research progress and developments in LLMs. I appreciate that @hwchung27 made this content publicly available. It's a great way to catch up on some important themes like scaling and optimizing LLMs. talk: piped.video/dbo3kNKPaUA?feature… slides: docs.google.com/presentation…

382

1,575

229,503

elvis · Feb 10, 2024 · 4:11 PM UTC

elvis

@omarsar0

10 Feb 2024

Stanford CS25 - Transformers United So much fun catching up with these Transformer lectures. There is a lot of content I'm already familiar with but I always love reviewing stuff to build on my understanding of complex concepts and learn new ones along the way. I find that in the field of LLMs, there are many different perspectives and interpretations so it's good to keep an open mind to different takes and explanations. This approach helps strengthen my intuition about LLMs. Pair it with a few coding sessions along the way and it's well worth every minute. At least that is how I've always made good use of these lectures. All the latest lectures are highly recommended.

313

1,589

139,162

elvis · May 20, 2024 · 4:51 PM UTC

elvis

@omarsar0

20 May 2024

Llama 3 From Scratch This project is really cool! It implements Llama 3 from scratch. The whole thing is explained, step-by-step, in the readme. I like the way it's broken down as it also serves as a good way to study the main components of an LLM. github.com/naklecha/llama3-f…

366

1,604

107,400

elvis · Dec 2, 2024 · 3:45 PM UTC

elvis

@omarsar0

2 Dec 2024

PydanticAI

A new Python-based agent framework to build production-grade LLM-powered applications.

- Built by the team behind Pydantic
- Model-agnostic
- Type-safe
- Structured response validation with Pydantic
- Streamed responses (including validation) with Pydantic
- Tools for testing and eval-driven iterative development
- Logfire integration for debugging and monitoring

244

1,585

122,929

elvis · Apr 22, 2023 · 6:07 PM UTC

elvis

@omarsar0

22 Apr 2023

Stanford CS330: Deep Multi-Task and Meta Learning Great new lectures to catch up on advanced topics in deep learning like meta learning and multi-task learning. Also includes other topics like generative models and few-shot learning. piped.video/playlist?list=PL…

328

1,545

173,364

elvis · Feb 25, 2024 · 3:41 PM UTC

elvis

@omarsar0

25 Feb 2024

GPT in 60 Lines of NumPy This looks like another fun tutorial on how to implement GPT from scratch with NumPy.

256

1,564

151,730

elvis · Aug 2, 2022 · 1:03 PM UTC

elvis

@omarsar0

2 Aug 2022

🎓 Probabilistic Machine Learning: Advanced Topics Got a chance to briefly check out the new ML book by @sirbayes. It's genuinely a one-of-a-kind resource for students looking to be well-versed in ML. 👏 probml.github.io/pml-book/bo…

304

1,533

elvis · Mar 28, 2023 · 1:24 AM UTC

elvis

@omarsar0

28 Mar 2023

ChatDoctor: A medical chat model fine-tuned on LLaMA using medical domain knowledge. The authors collected data on around 700 diseases and generated 5K doctor-patient conversations to fine-tune the LLM. paper: arxiv.org/abs/2303.14070 code: github.com/Kent0n-Li/ChatDoc…

374

1,559

229,905

elvis · Sep 7, 2025 · 6:17 PM UTC

elvis

@omarsar0

7 Sep 2025

Another impressive paper by Google DeepMind. It takes a closer look at the limits of embedding-based retrieval. If you work with vector embeddings, bookmark this one. Let's break down the technical details:

222

1,554

206,087

elvis · Jan 17, 2024 · 5:45 PM UTC

elvis

@omarsar0

17 Jan 2024

RAG vs. Fine-Tuning

Cool report discussing the tradeoff between RAG and fine-tuning when using LLMs like Llama 2 and GPT-4. It performs a detailed analysis and highlights insights when applying the pipelines on an agricultural dataset.

Here is a figure showing the pipeline used in this study:

Here is a summary of the comparison between RAG and fine-tuning results:

Findings: The authors observe that there is an "accuracy increase of over 6 p.p. when fine-tuning the model and this is cumulative with RAG, which increases accuracy by 5 p.p. further." They also "demonstrate that the fine-tuned model leverages information from across geographies to answer specific questions, increasing answer similarity from 47% to 72%."

RAG is effective where data is contextually relevant such as interpretation of farm data. However, it might significantly increase the prompt size and become harder to steer.

Fine-tuning, on the other hand, could be tuned for brevity and can incur less cost (i.e., necessitates minimal input token size) when dealing with large datasets. The challenge is the initial cost and effort required to fine-tune models on new data.

Overall, the suitability of each approach depends on the specific application, the nature and size of the data, and available resources for model development. As suggested by many other reports, there is also the possibility of combining the two approaches.

I also agree with the authors that it would be interesting to combine structured information from PDFs with images and captions to enable multi-modal fine-tuning opportunities.

340

1,531

177,575

elvis · Jul 18, 2025 · 4:12 PM UTC

elvis

@omarsar0

18 Jul 2025

A Survey of Context Engineering 160+ pages covering the most important research around context engineering for LLMs. This is a must-read! Here are my notes:

315

1,572

203,768

elvis · Oct 1, 2025 · 8:42 PM UTC

elvis

@omarsar0

1 Oct 2025

How do you build effective AI Agents? This is a problem I think deeply about with other AI devs and students. Simplicity works well here. I think we can all learn a lot from how Claude Code works. The Claude Agent SDK Loop generalizes the approach to build all kinds of AI agents. I wrote a few notes from Anthropic's recent guide.

The loop involves three steps:

- Gathering Context: Use subagents (parallelize them for task efficiency), compact/maintain context, and leverage agentic/semantic search for retrieving relevant context for the AI agent.
- Taking Action: Leverage tools, prebuilt MCP servers, bash/scripts, and generate code to take action and retrieve important feedback/context for the AI agent.
- Verifying Output: You can define rules to verify outputs, enable visual feedback (this becomes increasingly important in multimodal problems), and consider LLM-as-a-Judge to verify quality based on fuzzy rules.

I believe this is a really clean and solid framework for how to build and work with AI agents in all kinds of domains.
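The three steps above can be sketched as a plain Python loop. This is only an illustration under stated assumptions: it does not use the Claude Agent SDK, and every name here (`AgentState`, `run_agent_loop`, `gather`, `act`, `verify`) is hypothetical. The toy callables stand in for real context retrieval, tool use, and rule-based verification.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class AgentState:
    """Hypothetical container for what the agent knows and has produced."""
    task: str
    context: List[str] = field(default_factory=list)
    output: str = ""


def run_agent_loop(
    task: str,
    gather: Callable[[AgentState], List[str]],
    act: Callable[[AgentState], str],
    verify: Callable[[AgentState], Tuple[bool, str]],
    max_steps: int = 3,
) -> AgentState:
    """Repeat gather -> act -> verify until verification passes or steps run out."""
    state = AgentState(task=task)
    for _ in range(max_steps):
        state.context.extend(gather(state))   # 1. gather context
        state.output = act(state)             # 2. take action
        ok, feedback = verify(state)          # 3. verify output
        if ok:
            break
        # Feed the verifier's complaint back in as context for the next pass.
        state.context.append(f"feedback: {feedback}")
    return state


# Toy stand-ins (assumptions, not real tools or model calls).
def gather(state: AgentState) -> List[str]:
    # Only fetch the "document" once; later passes reuse existing context.
    return [] if state.context else ["doc: greet the user"]


def act(state: AgentState) -> str:
    text = "hello"
    # React to verifier feedback if any is present in the context.
    if any(item.startswith("feedback:") for item in state.context):
        text = text.upper()
    return text


def verify(state: AgentState) -> Tuple[bool, str]:
    # A simple rule-based check; an LLM-as-a-Judge could replace this.
    if state.output.isupper():
        return True, ""
    return False, "output must be uppercase"


result = run_agent_loop("greet", gather, act, verify)
print(result.output)
```

In this sketch the verifier's feedback is appended to the context, so the next action can correct itself. That is one simple way to close the loop, not the only one.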

213

1,587

143,945

elvis · Apr 3, 2023 · 1:19 AM UTC

elvis

@omarsar0

3 Apr 2023

A Survey of LLMs A new 50-page survey on large language models just dropped on arXiv. arxiv.org/abs/2303.18223

356

1,521

207,848