My 8000-word note on agents: huyenchip.com//2025/01/07/ag… Covering: 1. An overview of agents 2. How the capability of an AI-powered agent is determined by the set of tools it has access to and its capability for planning 3. How to select the best set of tools for your agent 4. Whether LLMs can plan and how to augment a model’s capability for planning 5. Agent’s failure modes AI-powered agents are an emerging field with no established theoretical frameworks for defining, developing, and evaluating them. This post is a best-effort attempt to build a framework from the existing literature, but it will evolve as the field does. As always, feedback is much appreciated!
58
455
2,817
381,226
Sam!!!
Small-but-happy win: If you tell ChatGPT not to use em-dashes in your custom instructions, it finally does what it's supposed to do!
190
191
10,369
1,472,395
$100 for anyone who can show me how to get ChatGPT to stop using emdashes. it's driving me insane
1,308
219
8,821
3,843,749
This thread is a combination of 10 free online courses on machine learning that I find the most helpful. They should be taken in order.
85
2,039
7,341
I wrote an 8k word doc on machine learning systems design. This covers: 1. Project setup 2. Data pipeline 3. Training & debugging 4. Serving with case studies, resources, and 27 exercises. This is the 1st draft so feedback is much needed. Thank you! github.com/chiphuyen/machine…
75
1,590
6,860
Machine learning engineering is 10% machine learning and 90% engineering.
74
483
6,596
OMG I'M SO HAPPY IT'S FINALLY HERE!!!
139
260
6,405
Things I’d prioritize learning if I were to study to become an ML engineer again: 1. Version control 2. SQL + NoSQL 3. Python 4. Pandas/Dask 5. Data structures 6. Prob & stats 7. ML algos 8. Parallel computing 9. REST API 10. Kubernetes + Airflow 11. Unit/integration tests
122
1,125
6,356
Finally got my copy! “AI Engineering” is officially out 🙏 🎉
217
388
6,256
421,669
It’s done! 150,000 words, 200+ illustrations, 250 footnotes, and over 1200 reference links. My editor just told me the manuscript has been sent to the printers. - The ebook will be coming out later this week. - Paperback copies should be available in a few weeks (hopefully before the end of the year). Preorder: amzn.to/49j1cGS - The full manuscript is also accessible on O'Reilly platform: oreillymedia.pxf.io/c/571911… This wouldn’t have been possible without the help of so many people who reviewed the early drafts, answered my thousands of questions, introduced me to fascinating use cases, or helped me see the beauty of overlooked techniques. Thank you everyone for making this happen!
171
587
5,738
355,617
My date: “You’re my number 1.” Me: “Are you zero indexed or one indexed?” Me: *single*
50
455
3,789
Sooo I wrote a 13,000-word lecture note on data distribution shifts, monitoring, and causes of ML failures. This was very difficult for me to write, because academia & industry literature use very different terminology. 
Feedback appreciated 🙏 docs.google.com/document/d/1…
40
674
3,669
So I wrote a 5400-word lecture note on the basics of data engineering for my students, covering: * data formats (row- vs. column-based, text vs. binary) * ETL * batch processing vs. stream processing * training datasets WIP. Feedback much appreciated! docs.google.com/document/d/1…
76
676
3,567
My editors just shared with me the feedback from early reviewers and I'm in tears 😭 With the help of so many people, I worked really hard on this book. I'm grateful that people gave it a chance. Read the book online: learning.oreilly.com/library… Pre-order: amazon.com/Designing-Machine…
74
379
3,461
I made a notebook with examples of cool Python features that either took me a long time to find out or were too intimidating for me to use. I especially focus on the features I find useful for machine learning. github.com/chiphuyen/python-…
31
760
3,306
Making slides for my course. Trying my best to prepare students for their machine learning job in the industry. What did you wish you knew before you deployed your first ML model?
90
545
3,268
I saw a guy debugging his model today. No ipdb. No unittest. No visualization. No disabling regularization. He just sat there staring at every line of code and cursing TensorFlow. Like every machine learning researcher I know.
46
414
3,043
When talking to people who haven’t deployed ML models, I keep hearing a lot of misperceptions about ML models in production. Here are a few of them. (1/6)
24
721
2,847
I’m excited to share that I’m working on a new book about building applications with foundation models! AI Engineering builds upon Machine Learning Systems Design, but with a focus on large scale, ready made models. The book covers: - The new AI stack (e.g. how it differs from traditional ML engineering) - Different approaches to evaluate open-ended systems - Dataset engineering - Prompt engineering, RAG, agents - Finetuning - Compute infrastructure, including how to mitigate latency and cost AI Engineering is scheduled for late 2024. The first 3 chapters are available on the O'Reilly platform: learning.oreilly.com/library… I’ve learned a lot during the research and writing process for this book. I hope you’ll find the learnings useful. Feedback is much appreciated!
55
307
2,705
212,633
Things I keep telling myself I want to learn to be proficient but probably will just keep googling the same questions for the rest of my life 1. Regex 2. SQL 3. Git 4. Vim 5. Bash 6. JavaScript 7. Matplotlib 8. How to cook chicken
70
176
2,474
An early draft of the machine learning interviews book is out 🥳 The book is open-sourced and free. Job search is a stressful process, and I hope that this effort can help in some way. Contributions and feedback are appreciated! huyenchip.com/ml-interviews-…
44
526
2,522
Is your model fast? No Is it cheap? No Does it at least solve our problem? No … But it’s StAtE oF tHe ArT
28
224
2,366
I love teaching and I’ve always wanted to come back to Stanford to teach. After 2 years with the help of many people, it's happening! I’ll be teaching Machine Learning Systems Design in Jan '21. It'll be online. Syllabus is below. Feedback appreciated! huyenchip.com/2020/10/27/ml-…
65
232
2,332
Two books, nine years apart 🥳 I'm so happy. I still can't believe that a US publisher wants to publish my book! Grateful to so many people who made this happen. Designing Machine Learning Systems is scheduled for early 2022. First 3 chapters are here oreilly.com/library/view/des…
49
194
2,241
I’m slowly beginning to accept that my productivity, when working with AI coding agents, is limited by my human brain. AI can do many tasks in parallel, but I can only track the context of a few, so I only run a few tasks at a time. I am the bottleneck.
192
174
2,289
150,738
I'm working on a book on machine learning interviews so I've been spending the last few months talking to companies about their hiring process for ML roles. This thread is a summary of what I've learned. It will be updated as the book progresses. (1/n)
46
533
2,199
My students are slowly realizing that if they want to run their models in browsers, they can't avoid JavaScript 🤪
72
182
1,983
To learn how to design machine learning systems, I find it really helpful to read case studies to see how great teams deal with different deployment requirements and constraints. Here are some of my favorite case studies.
23
503
1,947
Guys, hear me out. What if instead of showing students that same convnet picture in 10 different classes, some classes teach ML students engineering principles? Idk, something like unit test, CI/CD, performance profiling, typing, encoding, databases, etc.?
54
163
1,863
Open challenges in LLM research The first two challenges, hallucinations and context learning, are probably the most talked about today. I’m the most excited about 3 (multimodality), 5 (new architecture), and 6 (GPU alternatives). Number 5 and number 6, new architectures and new hardware, are very challenging, but are inevitable with time. Because of the symbiosis between architecture and hardware – new architecture will need to be optimized for common hardware, and hardware will need to support common architecture – they might be solved by the same company. I referenced a lot of papers here, but I have no doubt that I still missed a ton. If there’s something you think I missed, please let me know! huyenchip.com/2023/08/16/llm…
51
392
1,843
335,581
After my post on real-time machine learning last year, many people asked me how to do it. This post discusses the challenges + solutions for online prediction, online evaluation, and continual learning, with use cases and examples. Feedback appreciated! huyenchip.com/2022/01/02/rea…
32
394
1,780
Takeaway from @karpathy's CVPR talk: The most successful ML projects in prod (Tesla, iPhone, Amazon drones, Zipline) are where you own the entire stack. They iterate not just ML algorithms but also: - how to collect/label data - infrastructure - hardware ML models run on
19
253
1,742
The right tools can make you much more productive. Some of the favorite OSS tools I discovered recently. 1. JAX 2. jupytext 3. Streamlit 4. excalidraw 5. Facets 6. D3 7. SHAP 8. Prefect 9. PyTorch Lightning 10. Prometheus
21
264
1,588
I was never a good systems engineer, so I always avoided the topic of compiling and optimizing ML models. However, as I work with ML on devices, the topic keeps coming up. So I spent the last 3 months learning about ML compilers. Here’s what I learned. huyenchip.com/2021/09/07/a-f…
18
271
1,524
A conference just asked me to be a speaker because they *desperately* need a woman to put on the poster, which is currently all male. If you think it's flattering, it's not. Getting a last-minute minority speaker to avoid backlash isn't diversity. Please stop using us.
24
206
1,407
I looked at 200 tools for developing & deploying machine learning applications: - how the market evolved over time - the difference between ML applications and traditional software engineering applications - open-source vs open-core business model huyenchip.com/2020/06/22/mlo…
13
394
1,489
During the process of writing AI Engineering, I went through so many papers, case studies, blog posts, repos, tools, etc. This repo contains ~100 resources that really helped me understand various aspects of building with foundation models. github.com/chiphuyen/aie-boo…
Here are the highlights:
1. Anthropic’s Prompt Engineering Interactive Tutorial
The Google Sheets-based interactive exercises make it easy to experiment with different prompts and see immediately what works and what doesn’t. I’m surprised other model providers don’t have similar interactive guides: docs.google.com/spreadsheets…
2. OpenAI’s best practices for finetuning
While this guide focuses on GPT-3, many techniques are applicable to full finetuning in general. It explains how finetuning works, how to prepare training data, how to pick training hyperparameters, and common finetuning mistakes: docs.google.com/document/d/1…
3. Llama 3 paper
The section on post-training data is a gold mine as it details different techniques they used to generate 2.7 million examples for supervised finetuning. It also covers a crucial but less talked about topic: data verification, how to evaluate the quality of synthetic data: arxiv.org/abs/2407.21783
4. Efficiently Scaling Transformer Inference (Pope et al., 2022)
An amazing paper co-authored by Jeff Dean about inference optimization for transformer models. It covers not only different optimization techniques and their tradeoffs, but also provides a guideline for what to do if you want to optimize for different aspects, e.g. lowest possible latency, highest possible throughput, or longest context length: arxiv.org/abs/2211.05102
5. Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models (Lu et al., 2023)
My favorite study on LLM planners, how they use tools, and their failure modes. An interesting finding is that different LLMs have different tool preferences: arxiv.org/abs/2304.09842
6. AI Incident Database
For those interested in seeing how AI can go wrong, this contains over 3000 reports of AI harms: incidentdatabase.ai/
7. I find case studies from teams that have successfully deployed AI applications extremely educational. Here are some of my favorite enterprise case studies. I'll add more case studies soon!
- LinkedIn: linkedin.com/blog/engineerin…
- Pinterest's Text-to-SQL: medium.com/pinterest-enginee…
- Gmail’s Smart Compose (2019): arxiv.org/abs/1906.00080
- Grab: engineering.grab.com/llm-pow…
31
234
1,492
102,948
Based on conversations with a dozen companies doing real-time ML in both US and China, I wrote about online inference and online learning -- their use cases, solutions, and challenges. Machine learning is going real-time, and most companies aren't ready. huyenchip.com/2020/12/27/rea…
31
350
1,474
The secret to good writing is to publish it before you get bored of it.
14
197
1,432
The more I work in prod, the more I realize how hard it is to do prob & stats right. Splitting data is easy. Sampling data correctly is hard. Writing metrics is easy. Understanding what metrics measure is hard. Monitoring numbers is easy. Interpreting them is very, very hard.
15
179
1,428
Pandas is great for most day-to-day data analysis, but it has many quirks that can cause mysterious bugs or performance issues. Here is a list of pandas things I’ve learned that have made my life so much easier. As always, feedback is much appreciated! github.com/chiphuyen/just-pa…
9
189
1,387
ML community: * create algorithms that optimize for a single objective * Companies: * use ML to optimize for user engagement, which learns to favor extreme content since it gets the most attention * ML community: “Why are people on Twitter getting so extreme?”
15
155
1,345
I went through the most popular AI repos on GitHub, categorized them, and studied their growth trajectories. Here are some of the learnings: 1. There are 845 generative AI repos with at least 500 stars on GitHub. They are built with contributions from over 20,000 developers, making almost a million commits. 2. I divided the AI stack into four layers: application, application development, model development, and infrastructure. The application and application development layers have seen the most growth in 2023. The infrastructure layer remains more or less the same. Some categories that have seen the most growth include AI interface, inference optimization, and prompt engineering. 3. The landscape exploded in late 2022 but seems to have calmed down since September 2023. 4. While big companies still dominate the landscape, there’s a rise in massively popular software hosted by individuals. Several have speculated that there will soon be billion-dollar one-person companies. 5. China’s open source ecosystem is rapidly growing. 6 out of 20 GitHub accounts with the most popular AI repos originate in China, with two from Tsinghua University and two from Shanghai AI Lab.
38
297
1,362
346,212
I've been thinking about the software stack for machine learning. Tools I'd love to see. 1. Pip for pretrained models. 2. Version control for datasets. 3. GPU-friendly CI. Travis CI and CircleCI don't support GPUs. Jenkins is a pain. 4. Fast dataframes. Why is Pandas so slow?
60
188
1,350
Reasons I love writing and think everyone should write more: 1. To organize your thoughts 2. To learn: the best way to learn is to write/teach 3. To keep you accountable 4. To document your progress 5. To write better 6. To put your name out there 7. To enjoy the beauty of words
20
218
1,350
Omg just heard about this new cool framework numpy I think I'm gonna use it to replace tensorflow
The NumPy paper is out! nature.com/articles/s41586-0…
18
109
1,325
I'm using AI so much for work that I can tell how productive I am by how many conversations I've had with AI.
35
59
1,307
117,307
Some of the smartest people I know are leaving AI research for engineering/neuroscience. Their reasons? 1. We need to understand how humans learn to teach machines to learn. 2. Research should be hypothesis -> experiments, but AI research rn is experiments -> justifying results.
39
230
1,293
lazynlp: a library to scrape, clean, de-duplicate webpages to create massive datasets. It has instructions to download Reddit URLs, Gutenberg books, and Wikipedia. Using this library, you should be able to create text datasets >40GB. github.com/chiphuyen/lazynlp
7
299
1,250
Summary of Gemini's 60-page technical report. 1. Written in Jax and trained using TPUs. The architecture, while not explained in detail, seems similar to Flamingo's. 2. Gemini Pro's performance is similar to GPT-3.5 and Gemini Ultra is reported to be better than GPT-4. Nano-1 (1.8B params) and Nano-2 (3.25B params) are designed to run on-device. 3. 32K context length. 4. Very good at understanding vision and speech. 5. Coding ability: the big jump in HumanEval compared to GPT-4 (74.4% vs. 67%), if true, is awesome. However, the Natural2Code benchmark (no leakage on the Internet) shows a much smaller gap (74.9% vs. 73.9%). 6. On MMLU: using COT@32 (32 samples) to show that Gemini is better than GPT-4 seems forced. In the 5-shot setting, GPT-4 is better (86.4% vs. 83.7%). 7. No information at all on the training data, other than they ensured "all data enrichment workers are paid at least a local living wage." Full report: storage.googleapis.com/deepm…
18
217
1,231
217,188
Building a platform for generative AI applications huyenchip.com/2024/07/25/gen… After studying how companies deploy generative AI applications, I noticed many similarities in their platforms. This post outlines these common components, what they do, and implementation considerations. This post starts from the simplest architecture and progressively adds more components. 1. Enhance context input into a model by giving the model access to external data sources and tools for information gathering. 2. Put in guardrails to protect your system and your users. 3. Add model router and gateway to support complex pipelines and add more security. 4. Optimize for latency and costs with cache. 5. Add complex logic and write actions to maximize your system’s capabilities. I try my best to keep the architecture general, but certain applications might deviate. As always, feedback is appreciated!
19
230
1,258
107,676
I'm uncomfortable when papers make claims like "we beat SOTA by ..." or "our results are better than []". They turn research into a competition, not collaboration. Wouldn't it be nice if we can say: "Built upon the work of [], we're able to improve the results by ..."?
22
122
1,183
It's incredibly sad for me to say that my time at NVIDIA has ended. I'm grateful for the chance to work w/ so many wonderful people on challenging projects. As I'm going on a new adventure, I put down a quick note on the lessons I learned over the year. huyenchip.com/2019/12/23/lea…
36
117
1,224
We used to have so many problems. Then we started using machine learning. Now we have so many problems in Kubernetes.
16
78
1,151
During both my CS undergrad and master's at Stanford, nobody -- not a professor, not a TA -- told me about unit tests. I learned about it from someone who went to a school whose name I had never heard of. Apparently, you don’t need to know about testing to get fancy internships.
47
158
1,160
Looking back, my last decade was like a neural network. Some parts were linear. Some were nonlinear. I never seemed to get enough data, and always got stuck in local minima. There was a lot of learning. I can't explain how any of it worked, but the results came out alright.
17
130
1,150
Everyone is talking about Instagram boyfriends and I'm just sitting here wishing I had a GitHub boyfriend who'd review my code before I publish it
31
74
1,140
I open sourced Sniffly, a tool that analyzes Claude Code logs to help me understand my usage patterns and errors. Key learnings. 1. The biggest type of errors Claude Code made is Content Not Found (20 - 30%). It tries to find files or functions that don't exist. So I restructured my code base for discoverability, and the average number of steps Claude Code needs for each instruction went from 8 to 7 steps.
45
127
1,204
154,740
I've been at this startup for less than a month and I've been exposed to so many problems I didn't even know existed. I'm of the increasing belief that everyone should try a startup early in their career, e.g. within the first 3 years/before settling into complacency.
23
103
1,140
First 2 lectures' slides/notes on machine learning systems design are up! Covering: - Batch vs. online prediction, edge vs. cloud, offline vs. online learning - Multiple-objective optimization - When & when not to use ML ... Feedback much appreciated! docs.google.com/presentation…
23
214
1,121
8 years ago today, I lost my grandpa who raised me. They told me he died saying my name. He wanted to see me one last time, and I couldn't make it. If you love someone, please make time to see them. I'd give the world to see my grandpa again, but even the world isn't enough.
20
81
1,079
Really enjoyed LinkedIn's report on what worked and what didn't when deploying LLM applications. 4 takeaways. 1. Structured outputs: They chose YAML over JSON as the output format because YAML uses fewer tokens. Initially, only 90% of the outputs were correctly formatted YAML. They used re-prompting (asking the model to fix its YAML responses), which increased the number of API calls significantly. They then analyzed the common formatting errors, added those hints to the original prompt, and wrote an error-fixing script. This reduced their errors to 0.01%.
12
162
1,110
157,930
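The repair-before-re-prompt idea in the LinkedIn takeaway above can be sketched roughly as follows. This is a minimal illustration, not LinkedIn's actual script: it uses JSON from the Python standard library as a stand-in for YAML so that no third-party parser is needed, and `fix_common_errors`, `parse_with_repair`, and the `reprompt` callback are hypothetical names invented for the example. The point is only the ordering: try the raw output, then a cheap rule-based fix, and call the model again only as a last resort.

```python
import json
import re
from typing import Any, Callable, Tuple


def fix_common_errors(text: str) -> str:
    """Apply cheap rule-based fixes for two assumed, frequent formatting errors."""
    # Assumed error 1: the model wraps the object in chatty text.
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        text = text[start:end + 1]
    # Assumed error 2: trailing commas before a closing brace or bracket.
    # (A naive regex: it would also touch such sequences inside string values.)
    return re.sub(r",\s*([}\]])", r"\1", text)


def parse_with_repair(
    raw: str,
    reprompt: Callable[[str], str],
    max_reprompts: int = 1,
) -> Tuple[Any, int]:
    """Return (parsed_output, number_of_reprompts_used).

    `reprompt` is a hypothetical callback that asks the model to fix its
    own output and returns the new raw text.
    """
    for attempt in range(max_reprompts + 1):
        # Try the raw text first, then the rule-based repair of it.
        for candidate in (raw, fix_common_errors(raw)):
            try:
                return json.loads(candidate), attempt
            except json.JSONDecodeError:
                pass
        if attempt < max_reprompts:
            raw = reprompt(raw)  # extra API call, used only when repair fails
    raise ValueError("output could not be parsed or repaired")


def fake_reprompt(_bad_output: str) -> str:
    """Stand-in for a model call so the example runs offline."""
    return '{"ok": true}'


repaired, calls_a = parse_with_repair('Here you go: {"a": 1, "b": [1, 2,],}', fake_reprompt)
fallback, calls_b = parse_with_repair("not structured at all", fake_reprompt)
print(repaired, calls_a)
print(fallback, calls_b)
```

In this sketch the first input is fixed by the rules alone, so no extra model call is spent on it; only the second input falls through to the re-prompt.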
Memories are just dimensionality reduction of the past.
25
157
1,049
- 2.7 million have enrolled in Andrew Ng’s Machine Learning course - Geoffrey Hinton has been cited 340k times - TensorFlow has been used in 60k OSS projects Hypothesis: in 5 years, when these millions of students have gained hands-on experience, we'll have AI skills overflow.
49
126
1,063
Simpson's paradox is one of many reasons why it’s important to evaluate your models on different slices of data. Model 1 outperforms model 2 on group A and group B separately, but model 2 can still outperform model 1 overall. Statistics is amazing.
25
143
1,050
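The Simpson's paradox claim above can be checked with a few lines of arithmetic. The counts below are made up for illustration (they follow the shape of the classic kidney-stone example) and the names `results`, `group_accuracy`, and `overall_accuracy` are hypothetical. The sketch shows model 1 winning on each slice while model 2 wins on the pooled data, because the two models were evaluated on very different mixes of group A and group B.

```python
from fractions import Fraction

# Hypothetical evaluation counts: (correct predictions, total examples) per slice.
results = {
    "model_1": {"A": (81, 87), "B": (192, 263)},
    "model_2": {"A": (234, 270), "B": (55, 80)},
}


def group_accuracy(model: str, group: str) -> Fraction:
    """Accuracy of one model on one slice of the data."""
    correct, total = results[model][group]
    return Fraction(correct, total)


def overall_accuracy(model: str) -> Fraction:
    """Accuracy of one model on all slices pooled together."""
    correct = sum(c for c, _ in results[model].values())
    total = sum(t for _, t in results[model].values())
    return Fraction(correct, total)


for group in ("A", "B"):
    m1, m2 = group_accuracy("model_1", group), group_accuracy("model_2", group)
    print(f"group {group}: model_1={float(m1):.3f} model_2={float(m2):.3f}")

o1, o2 = overall_accuracy("model_1"), overall_accuracy("model_2")
print(f"overall: model_1={float(o1):.3f} model_2={float(o2):.3f}")
```

Reporting only the pooled number would pick model 2 here, which is why per-slice evaluation matters.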
I updated my list of ML tools: - 84 new tools (total 284) + interactive graph - overview of MLOps landscape 2020 - ML tooling startups that have raised money in 2020. More than half are outside the Bay Area. Growing hubs: Boston, NYC, Tel Aviv. huyenchip.com/2020/12/30/mlo…
19
191
1,060
2018: BERT looks good on leaderboards but it’s not practical because it's too big too slow 2020: BERT is used in almost every English search on Google There are many engineering challenges to bring large ML models to production, but they are being solved at a great speed 🚀
10
112
1,039
I analyzed compensation & level details of 19k tech workers to find answers to: 1. How long does it take for SWEs to reach a certain level? 2. Compensations across jobs/levels? 3. Do women get paid less than men in tech? 4. Is there a deadline for SWEs? huyenchip.com/2020/01/18/tec…
18
245
1,007
It’s incredible how well Microsoft has been able to build their ecosystem. VSCode: write code GitHub: share code Azure: run code Windows: use code OpenAI: code is temporary. AGI is forever
8
83
988
10 hours fixing a bug = 5h looking for that bug + 1m fixing + 4h59m complaining to everyone about it
13
109
966
Ayy how's your Saturday night going I just spent the last 3 hours trying to find that extra comma in a 20k row csv file and I'm beginning to question my entire career choice
46
48
966
Most courses only teach you how to train your models. This is the only one I've seen that shows you how to design, train, & deploy models. All videos are available. Great resource for those struggling with the ML system design Qs in interviews too. fullstackdeeplearning.com/ma…
6
235
954
I asked Claude Code to fix my bug and it just refused lol "Your app is working fine. This is a minor issue that doesn't break core functionality."
66
36
960
208,981
Non-CS people: "How many lines of code do you write a day?" Me: "Um idk minus 200?" Can we take a moment to appreciate all the relentless deleters who are willing to dig into the mess that is our code and make it concise and readable?
9
112
918
Element AI raised $257M and was sold for $230M after 4 years Sad to read this news. I guess this is another piece of evidence that AI startups founded and run by famous academics with good intention don't always have the best execution
36
125
911
Is meme making a respectable career? Asking for a friend
17
62
916
New post: RLHF - Reinforcement Learning from Human Feedback Discussing 3 phases of ChatGPT development, where RLHF fits in, how RLHF works, hypotheses on why it works, and relationship between RLHF and hallucination. huyenchip.com/2023/05/02/rlh…
18
210
929
139,491
The challenge for ML in production is to generalize to constantly changing edge cases. 2 main approaches: 1. Use massive data because more data can lead to better generalization 2. Build infra that allows models to learn to adapt in real-time Hmu if you're excited about #2!
42
76
917
Ayyyy we're out of stealth! @SnorkelML lets you programmatically, rather than manually, label your data and create massive training datasets in hours, not months. It’s also an end-to-end platform for iterative ML development & deployment. Check us out! snorkel.ai/07-14-2020-snorke…
23
138
925
New blog post: Multimodality and Large Multimodal Models (LMMs) Being able to work with data of different modalities -- e.g. text, images, videos, audio, etc. -- is essential for AI to operate in the real world. This post covers multimodal systems in general, including Large Multimodal Models. It consists of 3 parts. * Part 1 covers the context for multimodality. * Part 2 discusses how to train a multimodal system, using the architectures of CLIP and Flamingo, and examples from GPT-4V. * Part 3 discusses some active research areas for LMMs, including generating multimodal outputs. As always, feedback is appreciated! huyenchip.com/2023/10/10/mul…
13
188
909
243,108
When you get a new notification and hope it's your crush but it's just another recruiter asking if you're interested in joining a machine learning startup that doesn't have a client yet but already has a 9-figure valuation 🙄
12
51
888
I get the impression that many computer scientists look down on webdev, but I believe that everyone should learn some basic webdev skills. Being able to quickly create a website to showcase your portfolio can make a huge difference in how people find and perceive your work.
45
83
886
Best advice on attending conferences I've received. Papers & talks are online. The main value is the people. Look up authors/speakers/Twitter in advance to see who's going and email them asking for a quick chat. Talk to everyone, especially those whose work isn't already famous.
5
112
876
Some resources that I’ve found really helpful to understand machine learning in production. 1. Engineering starts with infrastructure. @vtuulos gave a great overview of the relationship between data science and infrastructure at Netflix piped.video/XV5VGddmP24
8
237
843
Now that AI can write linked lists better than us, can we get rid of this kind of interview question yet? piped.video/watch?v=Kv_E0tUT…
20
120
837
I wanted a list of women in Data/ML/AI/Stats to read their thoughts in one place. Within 30 minutes, I found these 100+ awesome women so don’t tell me you can’t find women speakers for your events. Lmk who I'm missing and I'll add them to the list! nitter.app/i/lists/12267802469903…
57
185
842
Some asked me about concept drift so here you go. A predictive ML model learns theta to output P(Y|X; theta). Data drift is when P(X) changes: different data distributions, different feature space. Ex: service launched in a new country, expected features becoming NaNs. 1/5
12
163
862
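As a rough illustration of the data drift definition above (P(X) changing), here is a minimal sketch that compares one feature between training data and serving data. It checks the two symptoms mentioned in the tweet: a shifted distribution and expected features turning into NaNs. The function names and the toy samples are assumptions for the example, and the two-sample Kolmogorov-Smirnov statistic is just one common choice of distance, not something the thread prescribes.

```python
import math
from bisect import bisect_right
from typing import List


def nan_rate(values: List[float]) -> float:
    """Fraction of values that are NaN."""
    return sum(1 for v in values if math.isnan(v)) / len(values)


def ks_statistic(reference: List[float], current: List[float]) -> float:
    """Two-sample KS statistic: the largest gap between the two empirical CDFs.

    NaNs are ignored here and should be tracked separately with nan_rate.
    """
    ref = sorted(v for v in reference if not math.isnan(v))
    cur = sorted(v for v in current if not math.isnan(v))
    # Both empirical CDFs are step functions, so the largest gap occurs
    # at one of the observed data points.
    return max(
        abs(bisect_right(ref, x) / len(ref) - bisect_right(cur, x) / len(cur))
        for x in ref + cur
    )


# Toy samples of a single feature (hypothetical values).
train = [1.0, 2.0, 3.0, 4.0]
serving = [3.0, 4.0, 5.0, float("nan")]

print("NaN rate in serving:", nan_rate(serving))        # 0.25
print("KS statistic:", ks_statistic(train, serving))    # 0.5
```

A statistic near 0 means the two samples look alike; how large a value should trigger an alert depends on sample size and the application, so any threshold would be an additional assumption.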
I love browsing GitHub checking out cool projects. What are your favorite open-source repos? Also, if you’re looking for cool OSS projects for ML, here are 260 of them! github.com/chiphuyen?tab=sta…
21
128
834
If you don't have a date this Valentine's just get more data
19
112
834
It took me a while to recap bc some papers are complex. Key points: - Destructing black box: convergence, generalization, neural tangent kernel - New approaches: Bayesian & uncertainty, graph, convex optimization - Neuroscience & bio-inspired algos - ... huyenchip.com/2019/12/18/key…
12
233
847
The article I’d love to read: Become a machine learning expert in 200 weeks with these 10,000 easy steps
17
82
827
ok apparently machine learning will never replace media pundits because models will need to be able to explain not only their predictions but also why they didn't happen
16
63
816
Many companies seem to want their own in-house LLMs: finetune an open-source LLM on their own data. Here are a few reasons for and against in-house LLMs I can think of. Would love to hear your thoughts.
70
125
823
155,344
After working for a while I realize how hopelessly outdated the concept "years of experience" is. It's so easy to spend years on a job and learn absolutely nothing.
18
119
808
Problems I'd work on if I were to do a startup again (though I probably won't any time soon because startups are hard). If you’re solving any of them, I’d love to chat. 1. Data synthesis: AI has become really good both at generating and annotating data. The challenge now is to make sure that the generated data is safe and legal, e.g. not violating any IP. 2. Evaluation: evaluation has gotten so much harder with LLMs, both because many people treat models as blackboxes (we deploy models someone else developed for us) and because outputs can be open-ended. At the same time, investment in evaluation is nowhere close to investment in model or application development. I’d be interested in arena-style evaluation, embedding evaluation, human-in-the-loop evaluation, as well as small, specialized scorers (instead of using large models like GPT-4 as judges). 3. Energy: the bottleneck to scaling AI is no longer compute but electricity. I’m interested in all energy-related problems, including both new energy sources and energy trading. 4. Any application that allows you to collect unique data that nobody has. I’ve heard concerns about building applications that seem to be “wrappers” around popular APIs. If you can get to the market early and gather sufficient data to continually improve your product, data is your moat. 5. GPU-native everything: many data science tools, including scikit-learn, pandas, and Spark, aren’t built to run natively on GPUs. There have been efforts to make these tools more efficiently leverage GPUs, but I think there’s still a lot of room for the software layer for GPUs (and not just NVIDIA GPUs). 6. Curated Internet: bots are already ruining dating apps, search (bots are incredibly good at SEO), and social media. I’d like to be able to set a boundary for my Internet, e.g. to limit the search results to those written by people I trust, or sources verified to be human.
50
89
836
140,453
You don't know what pain is until you've tried to install an older version of CUDA
33
59
815
New post: bringing LLM applications to production! 1. Challenges of LLM engineering & the solutions that I’ve seen 2. How to compose multiple tasks and incorporate tools (e.g. SQL executor, bash, web browsers, third-party APIs) 3. Promising use cases huyenchip.com/2023/04/11/llm…
23
221
818
160,827
New blog post! Discussing: 1. Whether a data scientist should be fullstack 2. What caused the unreasonable expectation that DS should know Kubernetes 3. An overview of the tools that can help abstract away infra to allow DS to own a project end-to-end 🚀 huyenchip.com/2021/09/13/dat…
20
160
800
It amazes me how tool developers underestimate the onboarding process. Every time I see a new tool, I want to try it out, but 90% of the time, I give up because it takes over 2 hours to just set up the right environment and get the tool to even import.
26
52
776