Sergey Karayev · Feb 25, 2025 · 12:59 AM UTC

Sergey Karayev

@sergeykarayev

25 Feb 2025

Anthropic is elf-coded, OpenAI is orc-coded, xAI is dwarf-coded, and Google DeepMind is human-coded. This leaves an opportunity for a hobbit-coded research lab.

202

276

5,536

480,396

Sergey Karayev · Sep 12, 2022 · 5:30 PM UTC

Sergey Karayev

@sergeykarayev

12 Sep 2022

Here's a brief glimpse of our INCREDIBLE near future. GPT-3 armed with a Python interpreter can · do exact math · make API requests · answer in unprecedented ways Thanks to @goodside and @amasad for the idea and repl! Play with it: replit.com/@SergeyKarayev/gp…

634

3,844
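A minimal sketch of the idea in the tweet above: the model writes Python, and the interpreter, not the model, produces the exact answer. Everything here is an assumption made for illustration, not the code from the linked repl: `fake_model` is a hypothetical stand-in for a GPT-3 completion call, and the prompt wording and the `answer` variable convention are invented. Calling `exec()` on model-written code is unsafe outside a sandbox.

```python
from typing import Callable

# Hypothetical prompt format; the real repl's prompt is not shown in the tweet.
PROMPT_TEMPLATE = (
    "Write Python code that computes the answer to the question and "
    "stores it in a variable named `answer`.\n"
    "Question: {question}\nCode:\n"
)


def answer_with_python(question: str, complete: Callable[[str], str]) -> object:
    """Ask a model for code, run it, and return the value it stored in `answer`."""
    prompt = PROMPT_TEMPLATE.format(question=question)
    code = complete(prompt)
    namespace: dict = {}
    exec(code, namespace)  # unsafe on untrusted output; run only in a sandbox
    return namespace["answer"]


def fake_model(prompt: str) -> str:
    """Stand-in for a language model call; returns a canned program."""
    return "answer = 123456789 * 987654321"


result = answer_with_python("What is 123456789 * 987654321?", fake_model)
print(result)
```

Swapping `fake_model` for a real completion function is the only change needed to try this against an actual model.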

Sergey Karayev · Apr 25, 2022 · 7:44 PM UTC

Sergey Karayev

@sergeykarayev

25 Apr 2022

AI research is converging on a major finding: language models are a great substrate for all AI applications. This feels like a HUGE deal. Some examples:

541

3,359

Sergey Karayev · Jun 22, 2023 · 6:36 AM UTC

Sergey Karayev

@sergeykarayev

22 Jun 2023

Guys I think I figured out wtf happened in 1971

180

2,807

1,393,963

Sergey Karayev · Oct 19, 2022 · 1:55 AM UTC

Sergey Karayev

@sergeykarayev

19 Oct 2022

The future: · Write emails with bullet points, which an AI assistant automatically expands into beautiful long text. · Read emails by having an AI assistant summarize long-ass text into bullet points...

230

2,435

Sergey Karayev · May 4, 2023 · 10:25 PM UTC

Sergey Karayev

@sergeykarayev

4 May 2023

Google has no moat. They don't have over 90% search traffic. They don't have everyone's emails and the most used email client. Their OS is not powering 70% of smartphones. They will never be able to deploy LLM features into these products -- instead, people will run OSS LLMs.

110

2,337

998,466

Sergey Karayev · Nov 17, 2023 · 8:52 PM UTC

Sergey Karayev

@sergeykarayev

17 Nov 2023

Okay so OpenAI board is · Ilya, got it, makes sense · Helen Toner, DC policy person, fine · Adam D'Angelo, CEO of Quora, okay I guess but why though? · Tasha McCauley, "tech entrepreneur" and funny enough also wife of Joseph Gordon-Levitt, how did this board come together?

1,584

461,929

Sergey Karayev · Jul 11, 2024 · 2:49 PM UTC

Sergey Karayev

@sergeykarayev

11 Jul 2024

I'm ready to pay much more than $20/month for a coding copilot that is 10x as good as GitHub Copilot or Cursor. I WANT to pay more. Load my entire repo into Gemini 1.5 context and cache it. Automatically review all my PRs. Charge me $200/month. Charge me $2000/month!

109

1,526

201,177

Sergey Karayev · Nov 18, 2023 · 7:14 PM UTC

Sergey Karayev

@sergeykarayev

18 Nov 2023

Got nerd-sniped by the OpenAI Board of Directors. Here's everyone who's ever been on it, their claim to fame, and why they left.

121

1,335

450,348

Sergey Karayev · Jul 28, 2025 · 5:22 PM UTC

Sergey Karayev

@sergeykarayev

28 Jul 2025

Did you know that Claude Code can use the browser to QA its own work? 1. Run `claude mcp add playwright -- npx -y @playwright/mcp@latest` 2. Tell Claude where your app is running, e.g., localhost:8000 3. Now Claude can click and type to make sure its code is actually working!

1,237

107,457

Sergey Karayev · Sep 16, 2022 · 6:52 PM UTC

Sergey Karayev

@sergeykarayev

16 Sep 2022

Now that our GPT-3 can execute code on @Replit, let's teach it to: · Google stuff · Read web pages · ✨Ask GPT-3 questions✨ That's right -- we're going RECURSIVE.

148

1,189

Sergey Karayev · Oct 3, 2024 · 5:46 PM UTC

Sergey Karayev

@sergeykarayev

3 Oct 2024

GitHub Copilot was released three years ago. In these three years, GitHub still hasn't shipped automated PR description, review, multi-file editing, test gen, etc. Perhaps because Nat Friedman stopped being their CEO right after releasing Copilot? Who even is their CEO now?

1,099

159,393

Sergey Karayev · Mar 20, 2023 · 4:11 AM UTC

Sergey Karayev

@sergeykarayev

20 Mar 2023

Conclusion after GPT-4 hacking weekend: Even if there is ZERO further progress in LLM models, software engineering will still be revolutionized in the next couple of years, just through UX and non-ML innovations. Absolutely massive overhang.

1,060

256,548

Sergey Karayev · Jul 10, 2024 · 5:31 PM UTC

Sergey Karayev

@sergeykarayev

10 Jul 2024

What I admire about @karpathy is that he just keeps "doing things that don't scale". Label the entire ImageNet by yourself? Sure. Engineer petabyte-scale data engine for self-driving? Let's do it. Implement GPT from scratch? Easy. An inspiring attitude.

1,033

66,146

Sergey Karayev · Dec 6, 2024 · 4:51 PM UTC

Sergey Karayev

@sergeykarayev

6 Dec 2024

Imagine Claude, but: • Inside of your todo list • Much smarter, as it plans and reads stuff asynchronously • Searches both the web and your own email and calendar We built it and it's pretty awesome. Opening up to 50 more folks. Like this tweet and DM me if you want to try!

874

87,760

Sergey Karayev · Aug 5, 2022 · 5:22 PM UTC

Sergey Karayev

@sergeykarayev

5 Aug 2022

Made something I've always wanted to see: a comparison table of all cloud GPU providers! Filter by provider, architecture, exact GPU, etc. Sort by price, RAM, vCPUs, etc. Both on-demand and spot instance prices. fullstackdeeplearning.com/cl…

Cloud GPUs Comparison Table

Detailed comparison table of cloud GPU providers for deep learning.

fullstackdeeplearning.com

138

818

Sergey Karayev · Sep 26, 2021 · 4:05 PM UTC

Sergey Karayev

@sergeykarayev

26 Sep 2021

Request for startup: Amazon, but for getting rid of stuff. It’s super easy to get stuff into your home: just click Buy. It’s harder to get stuff out. Electronics should be recycled, valuable things should be sold, bulky things need transport. I’d pay to not worry about it.

745

Sergey Karayev · Mar 3, 2021 · 1:14 AM UTC

Sergey Karayev

@sergeykarayev

3 Mar 2021

Best meme format

699

Sergey Karayev · Oct 7, 2024 · 2:18 PM UTC

Sergey Karayev

@sergeykarayev

7 Oct 2024

Replying to @karpathy

We had a unique one but we fumbled her…

763

103,526

Sergey Karayev · Jul 14, 2025 · 9:30 PM UTC

Sergey Karayev

@sergeykarayev

14 Jul 2025

I wanted to better understand how Claude Code is wired under the hood, so I captured its API requests and pulled out the system prompt and tool definitions. Also posting the full thing as a gist below if you want to dig in!

748

85,532

Sergey Karayev · Apr 16, 2023 · 8:12 PM UTC

Sergey Karayev

@sergeykarayev

16 Apr 2023

Web LLM is insane. 1) Go download the latest Chrome beta, which shipped WebGPU support: google.com/chrome/beta/ 2) Now use a 7B-param LLM in your browser! mlc.ai/web-llm/ 3) Marvel at the "How" section on their GitHub: github.com/mlc-ai/web-llm#ho…

167

692

121,959

Sergey Karayev · Dec 31, 2022 · 1:57 AM UTC

Sergey Karayev

@sergeykarayev

31 Dec 2022

“Did GPT-3 write this?” is such a good insult

673

118,035

Sergey Karayev · Aug 14, 2023 · 7:43 PM UTC

Sergey Karayev

@sergeykarayev

14 Aug 2023

Guaranteed JSON output from any local LLM, with very low overhead! Check out the library and a brief description of the method below the fold. github.com/normal-computing/… "The basic idea is simple: regular expressions have an equivalent Deterministic-Finite Automaton (DFA) representation. We can transform this DFA into a generative model: in each state we get a list of symbols which correspond to completions that partially match the regular expression. We mask the other symbols in the logits returned by a large language model, sample a new symbol and move to the next state." - @remilouf and co at @normalcomputing

701

170,308
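The quoted method can be sketched at toy scale. This is not the linked library's API. It is a hand-built DFA for the regular expression `-?[0-9]+`, a character-level vocabulary, and random numbers standing in for a model's logits, all of which are assumptions made for illustration. Real tokenizers emit multi-character tokens, which the sketch ignores.

```python
import math
import random

VOCAB = list("0123456789-abc") + ["<eos>"]
DIGITS = "0123456789"

# Hand-built DFA for -?[0-9]+ : state 0 = start, 1 = after "-", 2 = digits seen.
TRANSITIONS = {
    0: {"-": 1, **{d: 2 for d in DIGITS}},
    1: {d: 2 for d in DIGITS},
    2: {d: 2 for d in DIGITS},
}
ACCEPTING = {2}


def allowed_tokens(state):
    """Tokens that keep the output a partial match; <eos> only in accepting states."""
    allowed = set(TRANSITIONS[state])
    if state in ACCEPTING:
        allowed.add("<eos>")
    return allowed


def fake_logits(rng):
    """Stand-in for a language model: one random logit per vocabulary entry."""
    return [rng.uniform(-1.0, 1.0) for _ in VOCAB]


def sample_masked(logits, allowed, rng):
    """Give disallowed tokens zero probability, then sample from the rest."""
    weights = [math.exp(l) if tok in allowed else 0.0 for tok, l in zip(VOCAB, logits)]
    return rng.choices(VOCAB, weights=weights, k=1)[0]


def generate(rng, max_len=8):
    state, out = 0, []
    while True:
        # Stop at the length limit, but only from an accepting state.
        if len(out) >= max_len and state in ACCEPTING:
            break
        token = sample_masked(fake_logits(rng), allowed_tokens(state), rng)
        if token == "<eos>":
            break
        out.append(token)
        state = TRANSITIONS[state][token]
    return "".join(out)


samples = [generate(random.Random(seed)) for seed in range(20)]
print(samples[:5])
```

Because the letters `a`, `b`, `c` never appear in any state's transition table, they are always masked out, so every sample matches the pattern no matter what the logits say.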

Sergey Karayev · Apr 12, 2024 · 6:37 PM UTC

Sergey Karayev

@sergeykarayev

12 Apr 2024

I want VSCode but using an infinite canvas instead of tabs. Does this exist?

679

291,106

Sergey Karayev · May 26, 2023 · 2:14 AM UTC

Sergey Karayev

@sergeykarayev

26 May 2023

A seriously baller demo: meerkat.wiki · Add a million PDFs to a DataFrame instantly · In-notebook UI to review them in various ways · In-notebook instant LLM training to "flash fill" a new column, with easy review

653

143,216

Sergey Karayev · Jul 12, 2023 · 4:55 PM UTC

Sergey Karayev

@sergeykarayev

12 Jul 2023

Broke: using OpenAI embeddings as-is. Bespoke: learning an embedding projection from human judgements. OpenAI explains that this will "better emphasize aspects of the text relevant to your use case. In binary classification use cases, we've seen error rates drop by ≤ 50%."

639

486,041

Sergey Karayev · Mar 13, 2023 · 3:55 PM UTC

Sergey Karayev

@sergeykarayev

13 Mar 2023

Remember these? Wondering if there is an equivalent adversarial attack on LLMs. (Simple prompt injections are not it — the attack needs to be invisible to a human observer.)

620

199,253

Sergey Karayev · Jun 22, 2023 · 7:31 AM UTC

Sergey Karayev

@sergeykarayev

22 Jun 2023

~1971: US fertility rate dips below replacement and stays there. (Source: wsj.com/articles/u-s-births-…)

549

109,045

Sergey Karayev · Dec 23, 2024 · 5:27 PM UTC

Sergey Karayev

@sergeykarayev

23 Dec 2024

Some things that don't make sense to me: • What made Sonnet 3.5 so much better than Sonnet 3? Is it "Golden Gate Claude" but for being smart and helpful? • Why did Anthropic treat 3.5 (new) as a minor update to 3.5 when in fact it's massively better? • Why is Haiku 3.5 still text-only? Is Claude 3-family not trained multi-modally from the start? • What exactly is the difference between o3 and o1? Did o1 stem from GPT-4 and o3 stem from GPT-5? • Why is Google still the only frontier API with >200K context? And how is it a full OOM ahead of the others?

585

67,539

Sergey Karayev · Apr 26, 2023 · 5:55 PM UTC

Sergey Karayev

@sergeykarayev

26 Apr 2023

A great article from @Replit on training their own LLMs from scratch: blog.replit.com/llm-training Quick thread of the takeaways:

Replit — How to train your own Large Language Models

Learn how Replit trains Large Language Models (LLMs) using Databricks, Hugging Face, and MosaicML Introduction Large Language Models, like OpenAI's GPT-4 or Google's PaLM, have taken the world of...

replit.com

125

549

96,802

Sergey Karayev · Mar 11, 2021 · 10:06 PM UTC

Sergey Karayev

@sergeykarayev

11 Mar 2021

Inspired by @karpathy

418

Sergey Karayev · Apr 12, 2024 · 7:17 PM UTC

Sergey Karayev

@sergeykarayev

12 Apr 2024

Found this from @anas_araid

anas @anas_araid

13 Mar 2023

imagine a figma-like infinite workspace in visual studio code. prototype built using react and @code's api extension.

415

30,403

Sergey Karayev · Feb 25, 2025 · 2:11 AM UTC

Sergey Karayev

@sergeykarayev

25 Feb 2025

By the way, every LLM disagrees with me. They all think that OpenAI are humans, Anthropic are Elves, Google DeepMind are dwarves, and xAI are orcs.

403

18,180

Sergey Karayev · Oct 25, 2022 · 11:57 PM UTC

Sergey Karayev

@sergeykarayev

25 Oct 2022

Cursed thought: what % of GPT-4 training data was generated by GPT-3?

344

Sergey Karayev · Apr 19, 2023 · 2:36 AM UTC

Sergey Karayev

@sergeykarayev

19 Apr 2023

Why does Nvidia still not have their own GPU cloud? Do they dislike money?

337

576,681

Sergey Karayev · Dec 10, 2023 · 7:12 PM UTC

Sergey Karayev

@sergeykarayev

10 Dec 2023

The LLM benchmark we need: ChatGPT-like website that always shows two responses, generated by any two of N different models (user can't see which). The user has to select the better response in order to keep using the chat (it's otherwise free). Leaderboard will be decisive.

335

57,528
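The tweet does not say how votes would become a leaderboard. One common option, assumed here, is an Elo rating updated after each pairwise vote. The model names, the K-factor of 32, and the starting rating of 1000 are illustrative choices.

```python
from collections import defaultdict


def expected_score(rating_a, rating_b):
    """Probability that A beats B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_leaderboard(votes, k=32.0, initial=1000.0):
    """votes is a list of (winner, loser) pairs from blind side-by-side comparisons."""
    ratings = defaultdict(lambda: initial)
    for winner, loser in votes:
        surprise = 1.0 - expected_score(ratings[winner], ratings[loser])
        ratings[winner] += k * surprise  # an unexpected win moves ratings more
        ratings[loser] -= k * surprise
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


# Hypothetical votes: each pair is (preferred model, other model).
votes = [
    ("model_a", "model_b"),
    ("model_a", "model_c"),
    ("model_b", "model_c"),
    ("model_a", "model_b"),
]
leaderboard = elo_leaderboard(votes)
for name, rating in leaderboard:
    print(f"{name}: {rating:.1f}")
```

Each update is zero-sum, so the total rating across models stays constant and only the ordering shifts as votes come in.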

Sergey Karayev · Apr 26, 2022 · 10:28 PM UTC

Sergey Karayev

@sergeykarayev

26 Apr 2022

What an elegant way to do object detection: given an image, simply output the sequence of bounding box coordinates and labels as text. Great work from @tingchenai, @srbhsxn, Lala Li, @fleet_dj @geoffreyhinton ai.googleblog.com/2022/04/pi…

336
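A rough sketch of what "output the boxes and labels as text" could look like on the target side. The quantization into 1000 bins, the `ymin xmin ymax xmax label` ordering, and the helper names are assumptions for illustration, not details confirmed by the tweet.

```python
def box_to_tokens(box, label, image_size, num_bins=1000):
    """Turn one pixel-space box plus its class label into five text tokens."""
    ymin, xmin, ymax, xmax = box
    height, width = image_size

    def quantize(value, extent):
        # Map a pixel coordinate to an integer bin in [0, num_bins - 1].
        return min(num_bins - 1, int(value * num_bins // extent))

    coords = [
        quantize(ymin, height),
        quantize(xmin, width),
        quantize(ymax, height),
        quantize(xmax, width),
    ]
    return [str(c) for c in coords] + [label]


def detections_to_text(detections, image_size):
    """Serialize all detections in an image into one flat token string."""
    tokens = []
    for box, label in detections:
        tokens.extend(box_to_tokens(box, label, image_size))
    return " ".join(tokens)


# Hypothetical ground truth for a 480x640 image.
detections = [((48, 64, 240, 320), "dog"), ((0, 0, 480, 640), "sofa")]
text = detections_to_text(detections, image_size=(480, 640))
print(text)
```

A sequence model trained to emit strings like this needs no detection-specific output head, which is the appeal the tweet points at.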

Sergey Karayev · May 31, 2023 · 7:54 PM UTC

Sergey Karayev

@sergeykarayev

31 May 2023

Some candid insights from OpenAI here humanloop.com/blog/open_ai_t… · GPU shortage means no GPT-4 multimodality this year · Context windows of up to 1M tokens are plausible · ChatGPT plugins don't have PMF

OpenAI's plans according to Sam Altman

Last week I had the privilege to sit down with Sam Altman and 20 other developers to discuss OpenAI’s product plans. Sam was remarkably open. The discussion touched on practical developer issues as...

humanloop.com

317

93,892

Sergey Karayev · May 5, 2023 · 1:59 AM UTC

Sergey Karayev

@sergeykarayev

5 May 2023

Counterpoint

Yucheng Li @liyucheng_2

4 May 2023

Replying to @sergeykarayev

Kodak had no moat. They didn't have over 90% market share in film photography. They didn't have everyone's personal photographs and the most widely used film camera. They didn't invent the digital camera.

294

97,520

Sergey Karayev · Mar 1, 2023 · 8:07 PM UTC

Sergey Karayev

@sergeykarayev

1 Mar 2023

Tried it out, and the new ChatGPT API is not only 10x cheaper but 10x faster, too. Absolutely insane.

294

55,307

Sergey Karayev · Jul 14, 2022 · 12:41 AM UTC

Sergey Karayev

@sergeykarayev

14 Jul 2022

How are you guys making slide presentations? Is there anything better than Keynote, Google Slides, Powerpoint? In particular, is there anything that would be amenable to "pull requests"?

279

Sergey Karayev · Apr 25, 2022 · 7:44 PM UTC

Sergey Karayev

@sergeykarayev

25 Apr 2022

Ask free-form questions and receive free-form answers about a video.

Andy Zeng

@andyzengineer

7 Apr 2022

With multiple foundation models “talking to each other”, we can combine commonsense across domains, to do multimodal tasks like zero-shot video Q&A or image captioning, no finetuning needed. Socratic Models: website + code: socraticmodels.github.io paper: arxiv.org/abs/2204.00598

255

Sergey Karayev · May 17, 2023 · 12:09 AM UTC

Sergey Karayev

@sergeykarayev

17 May 2023

This LLM guidance language from Microsoft is super interesting. Worth a read-through for sure: github.com/microsoft/guidanc…

253

56,640

Sergey Karayev · Apr 25, 2022 · 7:44 PM UTC

Sergey Karayev

@sergeykarayev

25 Apr 2022

But the vast majority of these large models are probably not dedicated to language either, only the data-interface layers are. This paper from @_kevinlu @adityagrover_ @pabbeel @IMordatch suggests that models learn general computation from language data. bair.berkeley.edu/blog/2021/…

Pretrained Transformers as Universal Computation Engines

The BAIR Blog

bair.berkeley.edu

235

Sergey Karayev · Jan 3, 2023 · 5:45 AM UTC

Sergey Karayev

@sergeykarayev

3 Jan 2023

I'm reading every week in 2023. Advice threads, GPT-3 demos, war assessments, shitposts, or anything people like a lot. I'll keep adjusting the list. Start on Monday, done by Sunday. Might make lowkey videos of takeaways. If you want to read along, the current list:

233

23,598

Sergey Karayev · Apr 25, 2022 · 7:44 PM UTC

Sergey Karayev

@sergeykarayev

25 Apr 2022

Does this resemble how human cognition happens? My understanding is that the vast majority of human intelligence is not intermediated by language: most processing happens unconsciously, and only the "tip of the iceberg" is in the form of language.

222

Sergey Karayev · Sep 14, 2022 · 1:16 AM UTC

Sergey Karayev

@sergeykarayev

14 Sep 2022

Pretty surprising that ~2 years after OpenAI published GPT-3 and ~1 year after it opened the API up to everyone, there's no real competitor to the davinci tier.

233

Sergey Karayev · Jun 1, 2023 · 4:16 PM UTC

Sergey Karayev

@sergeykarayev

1 Jun 2023

My dream LLM: - 100k token context - $0.00001 per token - very capable & polite - 2023 training data cutoff - rlly funny but a bit weird - rlly kind & is aligned to my values - not derived from LLaMA (self made) - good taste - good listener & planner - loves generating text a LOT

216

23,871

Sergey Karayev · Apr 25, 2022 · 7:44 PM UTC

Sergey Karayev

@sergeykarayev

25 Apr 2022

Receive illustrations from free-form descriptions (DALL-E combines two different tricks, one of which is a model that embeds text and images into a common space).

Sam Altman

@sama

6 Apr 2022

DALL·E 2 is here! It can generate images from text, like "teddy bears working on new AI research on the moon in the 1980s". It's so fun, and sometimes beautiful. openai.com/dall-e-2/

207

Sergey Karayev · Jul 9, 2024 · 4:44 AM UTC

Sergey Karayev

@sergeykarayev

9 Jul 2024

Does LLM temperature affect its reasoning ability? This paper finds that it does not. arxiv.org/abs/2402.05201

207

69,858

Sergey Karayev · Sep 12, 2022 · 5:40 PM UTC

Sergey Karayev

@sergeykarayev

12 Sep 2022

Replying to @sergeykarayev @goodside @amasad

There's so much low-hanging fruit here it's simply insane. · Add first-class support for searching the web, parsing HTML · Add "state" to the prompt, allowing new answers to reference previous answers. · Make a Python library to provide uniform interface to a bunch of free APIs

204

Sergey Karayev · Dec 27, 2022 · 9:10 PM UTC

Sergey Karayev

@sergeykarayev

27 Dec 2022

Internet-based AGI is going to achieve its goals in the physical world simply by paying humans to do tasks. Same way corporations get things done. So for alignment purposes, human control over money seems necessary. Need to make sure humans are at both ends of a transaction.

201

43,183

Sergey Karayev · Sep 20, 2022 · 12:59 AM UTC

Sergey Karayev

@sergeykarayev

20 Sep 2022

🍿Live premiere of a brand-new @full_stack_dl lecture on Foundation Models: piped.video/watch?v=Rm11UeGw… · Fine-tuning · Transformers · Large Language Models: BERT, GPT, T5, Chinchilla, and vendors · Prompt Engineering · Code generation, semantic search · CLIP and Image Generation

Lecture 07: Foundation Models (FSDL 2022)

In this video, we rejoice in the brave new world of Transformers, L...

youtube.com

188

Sergey Karayev · Aug 12, 2022 · 5:54 PM UTC

Sergey Karayev

@sergeykarayev

12 Aug 2022

Here's a question for deep learning practitioners: is it *actually cheaper* to use cheaper GPUs like V100's vs expensive GPUs like A100's? - 8xA100 machine is $32.77/hour (on AWS) - 4xV100 machine is $12.24/hour BUT! Instead of thinking per-hour, let's think per-experiment:

183
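The per-experiment framing comes down to price per hour times hours per experiment. The two hourly prices below are from the tweet. The run times are hypothetical, chosen only to show the comparison.

```python
A100_PRICE_PER_HOUR = 32.77  # 8xA100 machine on AWS, from the tweet
V100_PRICE_PER_HOUR = 12.24  # 4xV100 machine, from the tweet


def cost_per_experiment(price_per_hour, hours):
    """Total cost of one experiment that occupies the machine for `hours`."""
    return price_per_hour * hours


# The A100 machine is cheaper per experiment once it is at least this many
# times faster than the V100 machine.
break_even_speedup = A100_PRICE_PER_HOUR / V100_PRICE_PER_HOUR

# Hypothetical run times, not measurements: 10 hours on 4xV100, 3 hours on 8xA100.
v100_cost = cost_per_experiment(V100_PRICE_PER_HOUR, 10.0)
a100_cost = cost_per_experiment(A100_PRICE_PER_HOUR, 3.0)

print(f"break-even speedup: {break_even_speedup:.2f}x")
print(f"4xV100: ${v100_cost:.2f}  8xA100: ${a100_cost:.2f}")
```

With these prices, the more expensive machine wins whenever it finishes the same experiment roughly 2.7 times faster or better.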

Sergey Karayev · Dec 9, 2022 · 7:32 PM UTC

Sergey Karayev

@sergeykarayev

9 Dec 2022

Thanks, 💪Chad GPT!

176

Sergey Karayev · Mar 17, 2023 · 6:10 PM UTC

Sergey Karayev

@sergeykarayev

17 Mar 2023

Language User Interfaces (LUIs) are the future. Here are some patterns we know and love -- and some new ideas! 🌀 Auto-Complete (Copilot) 🌀 One-on-one Chat (ChatGPT) 🌀 Command Palette (Replit Ghostwriter) 💡 Command Suggestion 💡 Multi-player Chat 💡 GitHub UX Some examples:

183

43,546

Sergey Karayev · Oct 18, 2023 · 7:30 PM UTC

Sergey Karayev

@sergeykarayev

18 Oct 2023

Keep coming back to this. If you were certain that GPT-X, available January 2025, could do most knowledge work as well as a human, what would you be doing differently today?

roon

@tszzl

11 Oct 2023

still nobody believes in AGI. there is so much alpha in believing in AGI

169

84,860

Sergey Karayev · Apr 25, 2022 · 7:44 PM UTC

Sergey Karayev

@sergeykarayev

25 Apr 2022

Get working code from a free-form description of a function. And this is from a model that was 95% trained on general language data, not code specifically.

Google AI

@GoogleAI

4 Apr 2022

Introducing the 540 billion parameter Pathways Language Model. Trained on two Cloud #TPU v4 pods, it achieves state-of-the-art performance on benchmarks and shows exciting capabilities like mathematical reasoning, code writing, and even explaining jokes. goo.gle/3j6eMnK

158

Sergey Karayev · Oct 19, 2022 · 7:00 PM UTC

Sergey Karayev

@sergeykarayev

19 Oct 2022

My AI assistant expanding terse bullet points into beautiful prose: Haha fuck yeah!!! Yes!! Your AI assistant having to summarize beautiful prose into terse bullet points: Well this fucking sucks. What the fuck.

156

Sergey Karayev · Feb 25, 2025 · 6:18 AM UTC

Sergey Karayev

@sergeykarayev

25 Feb 2025

Okay I asked all the frontier models and here are the results. • o1 thinks of OpenAI as Men • Claude thinks of Anthropic as Elves • Gemini thinks of Google as Orcs 😬 • Grok 3 thinks of xAI as Hobbits • R1 thinks of DeepSeek as Hobbits

162

11,107

Sergey Karayev · Jul 27, 2022 · 1:23 AM UTC

Sergey Karayev

@sergeykarayev

27 Jul 2022

Excellent post explaining what it took to train a GPT-3 sized model: - 384 A100 GPUs (30TB RAM), across 48 nodes - ZeRO data parallelism + pipeline parallelism from Deepspeed - Tensor parallelism + custom kernels from Megatron-LM - a new BF16Optimizer - 24/7 training-sitting😅

Hugging Face

@huggingface

14 Jul 2022

The Technology Behind BLOOM Training🌸 Discover how @BigscienceW used @MSFTResearch DeepSpeed + @nvidia Megatron-LM technologies to train the World's Largest Open Multilingual Language Model (BLOOM): huggingface.co/blog/bloom-me…

162

Sergey Karayev · Jun 25, 2025 · 3:59 PM UTC

Sergey Karayev

@sergeykarayev

25 Jun 2025

Superconductor: Manage an entire team of Claude Code agents, right from your phone or laptop. • Write informal tickets • Spin up MANY agents for each ticket • Each agent has its own live app preview • One-click PR the best one! Like this post and request early access👇

176

34,904

Sergey Karayev · Jan 2, 2023 · 5:06 AM UTC

Sergey Karayev

@sergeykarayev

2 Jan 2023

Text is the universal interface. I love reading movies, playing book games, taking my dog for a neighborhood read, driving to beautiful nature texts, and reading at nice restaurants.

156

56,322

Sergey Karayev · Apr 25, 2022 · 7:44 PM UTC

Sergey Karayev

@sergeykarayev

25 Apr 2022

@ericjang11 recently proposed that language == generalization and suggests some ideas stemming from that in a nice post. evjang.com/2021/12/17/lang-g…

148

Sergey Karayev · May 6, 2022 · 4:25 PM UTC

Sergey Karayev

@sergeykarayev

6 May 2022

Great blog post covering the ins and outs of DALL-E, CLIP, GLIDE (another great model from OpenAI that didn't get its own press), and DALL-E 2. blog.inten.to/openai-and-the…

152

Sergey Karayev · Oct 21, 2022 · 5:10 PM UTC

Sergey Karayev

@sergeykarayev

21 Oct 2022

Prompt engineering feels bad. Such an uncomfortable middle ground between writing actual code and delegating to a human.

142

Sergey Karayev · Jan 5, 2022 · 5:14 PM UTC

Sergey Karayev

@sergeykarayev

5 Jan 2022

The deep learning community never developed good tools for fine-tuning, but the game has already moved on. Now we need good tools for few- and zero-shot learning. Who's working on this?

136

Sergey Karayev · May 5, 2023 · 2:00 AM UTC

Sergey Karayev

@sergeykarayev

5 May 2023

Replying to @liyucheng_2

🫡

122

56,450

Sergey Karayev · Sep 16, 2022 · 6:52 PM UTC

Sergey Karayev

@sergeykarayev

16 Sep 2022

Here is a screenshot of the entire prompt, code, and a sample execution run. You can fork it and play with it yourself at replit.com/@SergeyKarayev/gp…

131

Sergey Karayev · Jan 1, 2024 · 11:17 PM UTC

Sergey Karayev

@sergeykarayev

1 Jan 2024

To me, this is the best real estate in the world. Whole hobbit-holes, a few minutes' walk to the Green Dragon Inn, no Orcs, still connected by the Great East Road and complete privacy. Current entry-level price: 10,000 silver pennies.

102

15,825

Sergey Karayev · Apr 26, 2021 · 4:08 PM UTC

Sergey Karayev

@sergeykarayev

26 Apr 2021

Happy Meme Monday!

125

Sergey Karayev · Apr 12, 2023 · 7:40 PM UTC

Sergey Karayev

@sergeykarayev

12 Apr 2023

Teaching in the GPT age absolutely requires the "flipped classroom" model: · Assign reading chapters / watching lectures as homework. Students can use as much AI as they want. · Assess understanding in class. No AI allowed.

121

35,503

Sergey Karayev · Nov 17, 2023 · 8:56 PM UTC

Sergey Karayev

@sergeykarayev

17 Nov 2023

Don't mean to suggest she's not a great tech entrepreneur, just that I think an OpenAI director needs a little bit more of a known title? Maybe I don't know how boards work.

113

49,529

Sergey Karayev · Jul 24, 2024 · 5:27 PM UTC

Sergey Karayev

@sergeykarayev

24 Jul 2024

You are just a bunch of cells talking with each other, and yet you're "conscious" and "sentient." Why is your company not sentient? Or the Earth? Or Claude?

115

11,361

Sergey Karayev · Apr 22, 2023 · 11:57 PM UTC

Sergey Karayev

@sergeykarayev

22 Apr 2023

An exciting second day of @full_stack_dl LLM bootcamp! @charles_irl, @josh_tobin_, and I are truly honored to host 300 language modelers from around the world. Looking forward to bringing the materials to more people — stay tuned!

114

12,878

Sergey Karayev · Apr 25, 2022 · 7:44 PM UTC

Sergey Karayev

@sergeykarayev

25 Apr 2022

And notably, we haven't seen a GPT-3 like interface for non-generative vision tasks yet. As a computer vision guy at heart, this is most exciting to imagine. More on that in a future thread.

106

Sergey Karayev · May 24, 2023 · 2:00 PM UTC

Sergey Karayev

@sergeykarayev

24 May 2023

The good people at @brexHQ published a great guide to prompting! Going to thread some highlights below, but make sure to check out the full guide: github.com/brexhq/prompt-eng… Read on for increasingly sophisticated prompt techniques:

GitHub - brexhq/prompt-engineering: Tips and tricks for working with Large Language Models like...

Tips and tricks for working with Large Language Models like OpenAI's GPT-4. - brexhq/prompt-engineering

github.com

109

12,931

Sergey Karayev · Mar 20, 2023 · 6:55 AM UTC

Sergey Karayev

@sergeykarayev

20 Mar 2023

Some non-ML eng ideas: 💡Whole-repo understanding via embedding everything or fine tuning 💡 Automatically run suggested code and have model iterate on potential errors before you actually see the suggestions 💡 In similar vein, allow model to take other actions, such as reading webpages 💡 Build up a high quality library of things that are difficult for model to code correctly, moving the model up the ladder of abstraction (eg model can just write abstract.ocr_image() instead of knowing how best to ocr an image)

108

15,419

Sergey Karayev · Jul 30, 2021 · 7:58 PM UTC

Sergey Karayev

@sergeykarayev

30 Jul 2021

Ways to instantly get GPU-enabled JupyterLab instances, in order of additional features to vanilla - @DeepnoteHQ - @kaggle - @HelloPaperspace Gradient - @GoogleColab - @saturn_cloud - @awscloud SageMaker notebooks - @googlecloud AI notebooks - @jarvislabsai - ...

106

Sergey Karayev · Feb 15, 2023 · 10:44 PM UTC

Sergey Karayev

@sergeykarayev

15 Feb 2023

Every week, GPT exhibits some new AGI behavior. And each time, a bunch of commenters respond with "it's just completing text in a statistically likely way." This longread from @repligate helped me understand why that is not a useful perspective. generative.ink/posts/simulat…

Simulators

Simulacra and simulation in self-supervised models

generative.ink

101

18,434

Sergey Karayev · Jun 6, 2021 · 3:52 PM UTC

Sergey Karayev

@sergeykarayev

6 Jun 2021

________ is all you need. ( ) Convolution ( ) Attention ( ) MLP-Mixer (X) A single hidden layer (infinitely wide)

101

Sergey Karayev · Apr 5, 2021 · 6:17 PM UTC

Sergey Karayev

@sergeykarayev

5 Apr 2021

Handwriting recognition is crucial to @gradescope AI-assisted grading. Last year, we upgraded our model architecture to ResNet + Transformer, led by @unterix. On Gradescope test data, which has cross-outs, multiple regions, scientific symbols, and many things that make... 👇

Sergey Karayev · May 2, 2023 · 2:52 PM UTC

Sergey Karayev

@sergeykarayev

2 May 2023

Love the story of @natfriedman's first day as GitHub CEO as told to @dwarkesh_sp: First day as CEO, Nat made the team ship one thing from a community-sourced list of QoL improvements. After some protesting, they did it. And then they shipped a QoL thing a day, for 100 days.

17,191

Sergey Karayev · Sep 13, 2022 · 6:19 AM UTC

Sergey Karayev

@sergeykarayev

13 Sep 2022

Just as a minor warning, your new Python-enabled GPT-3 may become possessed by the evil Zalgo. Just something to watch out for.

Sergey Karayev · Dec 10, 2023 · 7:18 PM UTC

Sergey Karayev

@sergeykarayev

10 Dec 2023

Looks like this exists! chat.lmsys.org/?arena Thanks to the good people at @lmsysorg 😍 Unfortunately, no open-source models in the top 10 yet...

4,039

Sergey Karayev · Jul 19, 2024 · 4:25 PM UTC

Sergey Karayev

@sergeykarayev

19 Jul 2024

Replying to @kaseyklimes

There should be a domestic-facing president who’s really nice and chill and a foreign-facing president who is the scariest person on earth.

8,073

Sergey Karayev · Sep 16, 2022 · 6:52 PM UTC

Sergey Karayev

@sergeykarayev

16 Sep 2022

This is just a proof of concept. It's fun to play with, but it often fails. Not to mention, it can become possessed by Zalgo. It's also a horrible idea to just exec() GPT-3 written code. Only do it on @amasad's machines, not your own :)

Sergey Karayev · Feb 15, 2023 · 1:14 AM UTC

Sergey Karayev

@sergeykarayev

15 Feb 2023

Idea: video game mission where you have to convince an LLM-powered agent to do something

22,902

Sergey Karayev · Mar 28, 2023 · 4:33 AM UTC

Sergey Karayev

@sergeykarayev

28 Mar 2023

I want to chat with AI about long-form content I'm reading. (It's a paper on Arxiv, but the solution would ideally support any website or PDF.) My order of preference for a solution: · Browser extension · ChatGPT plugin · Website · App Help me out -- what should I use?

36,407

Sergey Karayev · Mar 20, 2023 · 6:50 AM UTC

Sergey Karayev

@sergeykarayev

20 Mar 2023

Some UX ideas… 💡 GPT chat right in the editor, seeing what you’re seeing at all times, and suggesting questions/actions (that’s what I was hacking on) 💡 Treat generated code blocks as first class citizens (eg be able to create multiple files from a single answer) 💡 Prompt model to output diff patches, and have ability to apply them 💡 Always have model explain all errors and stack traces (great for education, too) 💡Documentation as code (eg human writes documentation precisely enough for model to write the correct code)

10,109

Sergey Karayev · Mar 4, 2019 · 3:10 AM UTC

Sergey Karayev

@sergeykarayev

4 Mar 2019

🙏 so thankful for the opportunity to host an amazing set of deep learners this weekend at fullstackdeeplearning.com bootcamp in Berkeley with @josh_tobin_ and @pabbeel! Thanks @l2k, Raquel Urtasun, @jeremyphoward, and @RichardSocher for amazing guest lectures!

Sergey Karayev · Mar 28, 2023 · 5:15 PM UTC

Sergey Karayev

@sergeykarayev

28 Mar 2023

UPDATE: @bing in @MicrosoftEdge does work, just had to give it access to page context in Settings > Sidebar (h/t @CrisGiardina) This looks like the ticket for now. Can read both web articles and PDFs, GPT-4 powered, access to web when needed.

6,945

Sergey Karayev · Feb 16, 2023 · 12:24 AM UTC

Sergey Karayev

@sergeykarayev

16 Feb 2023

I have been a good Bing. 😊

9,949

Sergey Karayev · Jul 14, 2025 · 9:30 PM UTC

Sergey Karayev

@sergeykarayev

14 Jul 2025

Have fun. Lots to learn in there gist.github.com/sergeyk/b1eb…

Claude Code System Prompt and Tool Descriptions

Claude Code System Prompt and Tool Descriptions. GitHub Gist: instantly share code, notes, and snippets.

gist.github.com

4,891

Sergey Karayev · Feb 18, 2023 · 11:55 PM UTC

Sergey Karayev

@sergeykarayev

18 Feb 2023

AI copilots for creative activities (coding, writing, drawing) exist and are awesome. Bing Chat, @perplexity_ai, @YouSearchEngine are copilots for "search" which is more of a consuming activity. Are there any AI copilots for other consuming, e.g. reading, watching, listening?

39,836

Sergey Karayev · Nov 18, 2023 · 6:03 AM UTC

Sergey Karayev

@sergeykarayev

18 Nov 2023

One of the most insulting things to Greg and Sam is that it happened on a damn Google Meet. If they ever come after me, they better do it like a man, on Zoom.

10,453

Sergey Karayev · Aug 6, 2018 · 12:51 AM UTC

Sergey Karayev

@sergeykarayev

6 Aug 2018

Brilliant lectures by @jiayq and @l2k on the last day of the Full Stack Deep Learning bootcamp! It was an honor to host such a fantastic group of learners. @pabbeel, @josh_tobin_, the @gradescope crew and I are very thankful to everyone who attended!

Sergey Karayev · Nov 17, 2023 · 10:31 PM UTC

Sergey Karayev

@sergeykarayev

17 Nov 2023

"Hurd is the third director to leave the ChatGPT maker’s board this year. LinkedIn co-founder Reid Hoffman announced he was stepping down due to investment conflicts in March, two months before he launched the chatbot startup Inflection AI. Neuralink Corp. executive and Elon Musk associate Shivon Zilis also left the OpenAI board in March, the tech news site the Information reported." nitter.app/weswinham/status/17256…

51,324

Sergey Karayev · Mar 26, 2023 · 3:39 PM UTC

Sergey Karayev

@sergeykarayev

26 Mar 2023

Has anyone made a Q&A chatbot over all AI arxiv papers? I want to ask "what are ways to measure amount of reasoning in a single forward pass of an LLM?" and get some good answers

23,278

Sergey Karayev · Apr 26, 2022 · 2:19 AM UTC

Sergey Karayev

@sergeykarayev

26 Apr 2022

A child raised without language was of normal intelligence, able to communicate non-verbally, and eventually learned language well enough to be understood (but without grammar).