Afroz Mohiuddin · May 18, 2021 · 6:59 PM UTC

Afroz Mohiuddin @afrozenator

18 May 2021

Never imagined MUM would end up in I/O when we first started it ;-) ;-)

Google

@Google

18 May 2021

Multitask Unified Model (MUM) — our latest AI milestone — has the potential to transform how Google helps you with complex information tasks. #GoogleIO

112

Afroz Mohiuddin · Dec 3, 2022 · 7:16 PM UTC

Afroz Mohiuddin @afrozenator

3 Dec 2022

Got published in Nature Communications :D nature.com/articles/s41467-0… With awesome collaborators: @alvin_rajkomar Eric Loreaux, Yuchen Liu, Jonas Kemp, Benny Li, Ming-Jun Chen, Yi Zhang & @Mysiak ...

Afroz Mohiuddin · Apr 5, 2025 · 8:32 PM UTC

Afroz Mohiuddin @afrozenator

5 Apr 2025

Extremely proud to have pioneered large scale distillation for Maverick and really delighted to be working alongside an extremely talented team. We truly hope the OSS community enjoys the fruits of our labour.

AI at Meta

@AIatMeta

5 Apr 2025

Today is the start of a new era of natively multimodal AI innovation. Today, we’re introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick — our most advanced models yet and the best in their class for multimodality. Llama 4 Scout • 17B-active-parameter model with 16 experts. • Industry-leading context window of 10M tokens. • Outperforms Gemma 3, Gemini 2.0 Flash-Lite and Mistral 3.1 across a broad range of widely accepted benchmarks. Llama 4 Maverick • 17B-active-parameter model with 128 experts. • Best-in-class image grounding with the ability to align user prompts with relevant visual concepts and anchor model responses to regions in the image. • Outperforms GPT-4o and Gemini 2.0 Flash across a broad range of widely accepted benchmarks. • Achieves comparable results to DeepSeek v3 on reasoning and coding — at half the active parameters. • Unparalleled performance-to-cost ratio with a chat version scoring ELO of 1417 on LMArena. These models are our best yet thanks to distillation from Llama 4 Behemoth, our most powerful model yet. Llama 4 Behemoth is still in training and is currently seeing results that outperform GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on STEM-focused benchmarks. We’re excited to share more details about it even while it’s still in flight. Read more about the first Llama 4 models, including training and benchmarks ➡️ go.fb.me/gmjohs Download Llama 4 ➡️ go.fb.me/bwwhe9

5,201

Afroz Mohiuddin · Nov 29, 2023 · 5:08 AM UTC

Afroz Mohiuddin @afrozenator

29 Nov 2023

“To get what you want, you have to deserve what you want. The world is not yet a crazy enough place to reward a whole bunch of undeserving people.” — Charlie Munger RIP

Compound248 💰

@compound248

29 Nov 2023

Highlight reel of Charlie Munger spitting one banger after another.

17,554

Afroz Mohiuddin · May 9, 2022 · 12:00 AM UTC

Afroz Mohiuddin @afrozenator

9 May 2022

Neat quotes from technical documentation. "Reusing the same [PRNG] state will cause sadness and monotony, depriving the end user of lifegiving chaos." jax.readthedocs.io/en/latest…
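
A minimal sketch of what that quote warns about, assuming a standard JAX install. The seed `0` and the sample shape are arbitrary choices for illustration; the calls used are `jax.random.PRNGKey`, `jax.random.split`, and `jax.random.normal`.

```python
import jax
import jax.numpy as jnp

# A PRNG key is explicit state: the same key always yields the same numbers.
key = jax.random.PRNGKey(0)  # seed 0 is an arbitrary choice for this sketch

same_a = jax.random.normal(key, (3,))
same_b = jax.random.normal(key, (3,))  # key reused, so this repeats same_a

# Split the key before each independent use instead of reusing it.
key, subkey_a = jax.random.split(key)
key, subkey_b = jax.random.split(key)
fresh_a = jax.random.normal(subkey_a, (3,))
fresh_b = jax.random.normal(subkey_b, (3,))

reused_identical = bool(jnp.array_equal(same_a, same_b))
split_identical = bool(jnp.array_equal(fresh_a, fresh_b))
```

Reusing `key` gives the "monotony" (identical draws); splitting first gives independent streams.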

Afroz Mohiuddin · Dec 12, 2020 · 4:07 PM UTC

Afroz Mohiuddin @afrozenator

12 Dec 2020

"I have always found it strange that a stock that falls is seen as risky but a stock that fell (so in the past) becomes an opportunity." - @FromValue in an interview with @InvestmentTalkk

Afroz Mohiuddin · Apr 5, 2025 · 8:51 PM UTC

Afroz Mohiuddin @afrozenator

5 Apr 2025

Really proud to present this model to the world and really excited about what is coming ahead 🔥🦾🚀

Arena.ai

@arena

5 Apr 2025

BREAKING: Meta's Llama 4 Maverick just hit #2 overall - becoming the 4th org to break 1400+ on Arena!🔥 Highlights: - #1 open model, surpassing DeepSeek - Tied #1 in Hard Prompts, Coding, Math, Creative Writing - Huge leap over Llama 3 405B: 1268 → 1417 - #5 under style control Huge congrats to @AIatMeta — and another big win for open-source! 👏 More analysis below⬇️

3,350

Afroz Mohiuddin · Feb 13, 2024 · 6:15 AM UTC

Afroz Mohiuddin @afrozenator

13 Feb 2024

Congratulations to dear friends @YiTayML @PiotrPadlewski @DaniYogatama and everyone at Reka AI for an amazing multimodal model in such a short time! Eagerly looking forward to more awesomeness ahead!

Yi Tay

@YiTayML

12 Feb 2024

We are excited to share Reka Flash ✨, a new state-of-the-art 21B multimodal model that rivals Gemini Pro and GPT 3.5 on key language & vision benchmarks 📈. We've trained this model from scratch and ground zero with a small (but amazingly capable 🧙‍♂️) team and relatively finite resources. We're amazed at how strong it is 🦾. I'm proud of our financially optimal LLM team. Abandoning one's comfort zone is surely difficult and having to redo things from scratch is often scary & daunting. Many things in the wilderness don't work from the get go and it was often a huge pain in the neck 😢. I should write a separate post someday on how much we've had to rebuild (and suffered 🤣). Everything from robust training infra, proper (human) evaluation pipelines and proper RLHF setups. I am thankful for the crazy talented team we have here ☺️. Meanwhile, our largest most capable model Reka-Core is finishing soon and we're already very excited by early results 📈. More to come very soon! 9 months in. Excited to be back at the frontier 🔥. Check out our blogpost here: reka.ai/reka-flash-an-effici…

6,716

Afroz Mohiuddin · Jun 21, 2022 · 11:13 PM UTC

Afroz Mohiuddin @afrozenator

21 Jun 2022

“I never lose. I either win or learn.” — Nelson Mandela

Afroz Mohiuddin · Apr 5, 2025 · 9:00 PM UTC

Afroz Mohiuddin @afrozenator

5 Apr 2025

Replying to @jeremyphoward @JeffDean

Cooking now!

3,699

Afroz Mohiuddin · Aug 11, 2020 · 1:13 AM UTC

Afroz Mohiuddin @afrozenator

11 Aug 2020

#Tensorflow in the #Numpy API - great work by @_agarwal_ashish, Wang Peng, Akshay Modi and team!! Works beautifully with #Trax ! Check it out :-)

François Chollet

@fchollet

10 Aug 2020

New in tf-nightly: the NumPy API. - GPU and TPU-accelerated NumPy code - Interoperable with the rest of the TF ecosystem Documentation: tensorflow.org/api_docs/pyth…

Afroz Mohiuddin · May 18, 2021 · 9:45 PM UTC

Afroz Mohiuddin @afrozenator

18 May 2021

Official Blog - blog.google/products/search/…

Afroz Mohiuddin · Nov 8, 2020 · 3:50 PM UTC

Afroz Mohiuddin @afrozenator

8 Nov 2020

So many gems in here compiled by @polina_marinova The Profile Dossier: Hamdi Ulukaya, the Shepherd-Turned-Billionaire CEO by @ProfileRead theprofile.substack.com/p/th…

Afroz Mohiuddin · Apr 21, 2024 · 4:15 AM UTC

Afroz Mohiuddin @afrozenator

21 Apr 2024

Also reminds me of Munger’s maxim of not taking a side in a debate unless you can put the opposite argument better than its best supporter. (This also left me without an opinion on most topics 🙃)

NS Ramnath

@rmnth

19 Apr 2024

Daniel Dennett (28th March 1942 - 19th April 2024) RIP Dennett on how to criticise wisely

1,489

Afroz Mohiuddin · Jul 17, 2025 · 2:47 AM UTC

Afroz Mohiuddin @afrozenator

17 Jul 2025

Congratulations to the talented team @achowdhery @IrwanBello @real_ioannis

Aakanksha Chowdhery

@achowdhery

16 Jul 2025

Today we launch Asimov. Asimov is our code research agent that is best-in-class in codebase comprehension. It is built for teams, built for enterprises, and built to remember. We use it everyday to accelerate our velocity and streamline distributed ops. Link below to sign up for waitlist.

2,364

Afroz Mohiuddin · May 29, 2021 · 1:52 PM UTC

Afroz Mohiuddin @afrozenator

29 May 2021

"The history of mathematics is a history of horrendously difficult problems being solved by young people too ignorant to know they are impossible" — Freeman Dyson 1/n

Afroz Mohiuddin · May 20, 2023 · 4:45 PM UTC

Afroz Mohiuddin @afrozenator

20 May 2023

A friend quoted that the hardest part about designing a nuclear power plant is how to assign parking spots -- the LLM/DL equivalent of that is the config system :P

2,051

Afroz Mohiuddin · Aug 9, 2025 · 11:14 PM UTC

Afroz Mohiuddin @afrozenator

9 Aug 2025

Replying to @YiTayML

Truly the Bell Labs of its day (possibly still!)

1,832

Afroz Mohiuddin · Apr 30, 2021 · 2:25 PM UTC

Afroz Mohiuddin @afrozenator

30 Apr 2021

Replying to @vitaliyk

"Life is a tragedy for those who feel, but a comedy to those who think."

Afroz Mohiuddin · Oct 23, 2021 · 1:28 AM UTC

Afroz Mohiuddin @afrozenator

23 Oct 2021

Just read @vardi 's thought provoking thoughts "The Sand-Heap Paradox of Privacy and Influence" in the recent CACM. cacm.acm.org/magazines/2021/… The irony of the last sentence "Follow me on ..." and my own post on Twitter is duly noted.

Afroz Mohiuddin · Apr 1, 2022 · 2:47 PM UTC

Afroz Mohiuddin @afrozenator

1 Apr 2022

T5X paper is now on ArXiv! arxiv.org/abs/2203.17189

arXiv Daily

@Arxiv_Daily

1 Apr 2022

Scaling Up Models and Data with t5x and seqio deepai.org/publication/scali… by Adam Roberts et al. including @afrozenator, @ChenZhuo19 #NeuralNetwork #OpenSource

Afroz Mohiuddin · May 29, 2021 · 1:52 PM UTC

Afroz Mohiuddin @afrozenator

29 May 2021

"Von Neumann didn't say anything but after five minutes he raised his hand. When I called on him he went to the blackboard and proceeded to write down the proof. After that I was afraid of von Neumann." "How To Solve It" 2nd ed. (1957), p. xv, George Pólya

Afroz Mohiuddin · May 30, 2023 · 9:48 PM UTC

Afroz Mohiuddin @afrozenator

30 May 2023

Replying to @borisdayma

Data repeats!

4,008

Afroz Mohiuddin · May 29, 2024 · 2:50 AM UTC

Afroz Mohiuddin @afrozenator

29 May 2024

A midday nap has to rank as one of the most understated pleasures in life.

650

Afroz Mohiuddin · Jun 18, 2025 · 6:44 AM UTC

Afroz Mohiuddin @afrozenator

18 Jun 2025

Good guy Geoffrey Hinton dropping some zingers in this interview -- piped.video/watch?v=giT0ytyn… The whole thing is well worth a listen, but here's what struck me: 1/3

Godfather of AI: They Keep Silencing Me But I’m Trying to Warn Them!

He pioneered AI, now he’s warning the world. Godfather of AI Geoffr...

youtube.com

2,158

Afroz Mohiuddin · Jun 30, 2021 · 9:20 AM UTC

Afroz Mohiuddin @afrozenator

30 Jun 2021

It's straightforward to work hard if you have clearly defined, externally imposed goals, as you do in school. ... What I've learned since I was a kid is how to work toward goals that are neither clearly defined nor externally imposed.

Paul Graham

@paulg

29 Jun 2021

How to Work Hard: paulgraham.com/hwh.html

Afroz Mohiuddin · Jun 15, 2022 · 2:13 AM UTC

Afroz Mohiuddin @afrozenator

15 Jun 2022

Very well done folks!

Liam Fedus

@LiamFedus

14 Jun 2022

Today we're releasing all Switch Transformer models in T5X/JAX, including the 1.6T param Switch-C and the 395B param Switch-XXL models. Pleased to have these open-sourced! github.com/google-research/t… All thanks to the efforts of James Lee-Thorp, @ada_rob, and @hwchung27

Afroz Mohiuddin · Nov 9, 2023 · 2:21 AM UTC

Afroz Mohiuddin @afrozenator

9 Nov 2023

Replying to @PiotrPadlewski @m__dehghani @neilhoulsby @_basilM

You’ll be missed that’s for sure! All the best!!

8,039

Afroz Mohiuddin · May 8, 2021 · 3:00 PM UTC

Afroz Mohiuddin @afrozenator

8 May 2021

Present your ideas "To illuminate and not to impress" - a fact that gets lost in too many research papers and presentations.

Rethinking ML Papers @rethinkmlpapers

7 May 2021

Now we have Terence Parr speaking about the role of visualization in ML research

Afroz Mohiuddin · Jan 24, 2023 · 10:34 PM UTC

Afroz Mohiuddin @afrozenator

24 Jan 2023

Fed up with not understanding your Dr's notes? Here's a blog about our recent work on expanding abbreviations from clinical notes that was published in Nature Communications! ai.googleblog.com/2023/01/de… 1/2

Deciphering clinical abbreviations with privacy protecting ML

Posted by Alvin Rajkomar, Research Scientist, and Eric Loreaux, Software Engineer, Google Research Today many people have digital access ...

research.google

616

Afroz Mohiuddin · Aug 12, 2020 · 8:02 PM UTC

Afroz Mohiuddin @afrozenator

12 Aug 2020

Replying to @AndrewRangeley

You may know this already, but I found Barry Diller's chats with Reid Hoffman interesting in this regard. Their M/O seems to be to start with a fresh take on things, a clean slate -- given that pedigree, this falls into a pattern. Transcript of Part 2 - mastersofscale.com/wp-conten…

Afroz Mohiuddin · Dec 11, 2024 · 5:43 PM UTC

Afroz Mohiuddin @afrozenator

11 Dec 2024

Replying to @_arohan_ @AIatMeta

Wohooooooo!!!!!!! Really looking forward to this!!!

371

Afroz Mohiuddin · Jun 10, 2025 · 2:44 AM UTC

Afroz Mohiuddin @afrozenator

10 Jun 2025

Replying to @_arohan_ @AnthropicAI

Very sad to see you leave, I wish we’d worked more closely together! Thank you for helping push the boundaries while you were here :-) And all the very best for all that lies ahead!

584

Afroz Mohiuddin · Apr 15, 2024 · 8:22 PM UTC

Afroz Mohiuddin @afrozenator

15 Apr 2024

Quite an impressive achievement by friends at @RekaAILabs - congratulations @YiTayML , @PiotrPadlewski @DaniYogatama and all!

Yi Tay

@YiTayML

15 Apr 2024

Our @RekaAILabs Tech Report / Paper is out! 🔥 Tech reports with completely no information are kinda boring so we’re revealing some interesting information on how we train our series of Reka models including tokens, architecture, data & human evaluation workflows. 😃 We tried our best to give a behind-the-scenes experience 😊. In particular, if you enjoyed my previous blog post about training LLMs in the wilderness, there’s a dedicated section on that in this report! 🌴 We can’t disclose literally everything but we tried our best to make it interesting, I promise. 🙏 Here’s a rundown summary of some of the highlights. 🔹Edge and Flash are outrageously strong 7B and 21B models. They are trained on 4.5-5T tokens in total. Also, they have been improved significantly since their first public appearance! They outperform many popular faces. Some data mixture information is in the report. 🔹We discuss our internal human evaluation workflow, prompt distribution, and how we use Core for model development and automatic evaluation. 🔹We describe our infrastructure setup for training large models, quantifying node failures, and report loss curves for training our models. 🔹Aside from the hardware lottery, we also show how this affects node stability across time. Once we were told our cluster became less stable because there were "big guys" moving things around the data center. 😅 On performance which you might have already seen on other threads. 🔹Core approaches frontier-class models like Claude3 Opus and GPT4-V. It outperforms Claude3 Opus on third-party blind human evaluation for multimodal chat, outperforms Gemini Ultra on video QA, and is quite competitive to other frontier models on core text metrics. It also matches GPT4-V on MMMU! 🔹Core ranks #2 on our internal multimodal chat leaderboard, right after GPT4-V. On text, it ranks #3 just behind Claude Opus and GPT4 Turbo. Core outperforms GPT-4 (0613) on this ranking. 
This has been a focused and concentrated effort of a small team of ~20 people in the past 4 months (yes, we got access to 90%+ of our compute only late December last year! 🚀). This tech report tells our story. Enjoy! Happy to answer any questions in replies or DM! PS: it was nice writing in latex after one whole year! PPS: I had quite some fun writing this 😊. There's some puns and easter eggs and interesting tidbits in there. Trust me. 😏 Link: publications.reka.ai/reka-co…

525

Afroz Mohiuddin · May 24, 2021 · 9:08 PM UTC

Afroz Mohiuddin @afrozenator

24 May 2021

Congratulations to @avitaloliver @anselmlevskaya and others!

Hugging Face

@huggingface

24 May 2021

🔥JAX meets Transformers🔥 @GoogleAI's JAX/Flax library can now be used as Transformers' backbone ML library. JAX/Flax makes distributed training on TPU effortless and highly efficient! 👉 Google Colab: colab.research.google.com/gi… 👉 Runtime evaluation: github.com/huggingface/trans…

Afroz Mohiuddin · Jan 16, 2024 · 9:37 PM UTC

Afroz Mohiuddin @afrozenator

16 Jan 2024

“The illiterate of the 21st century will not be those who cannot read and write, but those who cannot learn, unlearn, and relearn. ” — Alvin Toffler, Future Shock, 1970 (!)

3,747

Afroz Mohiuddin · Nov 28, 2021 · 4:53 AM UTC

Afroz Mohiuddin @afrozenator

28 Nov 2021

"He saw what people might say, turned it into what they ought to say, and then answered." -- Adam Gopnik on Charles Darwin. themarginalian.org/2015/11/1…

Afroz Mohiuddin · Jun 12, 2025 · 10:59 PM UTC

Afroz Mohiuddin @afrozenator

12 Jun 2025

Replying to @_arohan_

Noam notation is the sanest thing ever …. medium.com/@NoamShazeer/shap… Fyi @moultano

Shape Suffixes — Good Coding Style

If you code neural networks, I believe that this convention can make your life more pleasant. We keep this pretty religiously at…

medium.com

636
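
A small sketch of the shape-suffix convention the linked post describes: each tensor name ends in letters naming its dimensions, so shapes can be read off the variable names. The dimension letters, sizes, and the toy feed-forward layer here are illustrative assumptions, not taken from the post.

```python
import numpy as np

# Dimension key (hypothetical sizes chosen for this sketch):
#   B: batch size, L: sequence length, D: model dim, F: feed-forward hidden dim
B, L, D, F = 2, 5, 8, 16

rng = np.random.default_rng(0)
x_BLD = rng.normal(size=(B, L, D))     # input activations
w_in_DF = rng.normal(size=(D, F))      # first projection
w_out_FD = rng.normal(size=(F, D))     # second projection

# The einsum subscripts mirror the suffixes, so a mismatch is easy to spot.
h_BLF = np.maximum(np.einsum("BLD,DF->BLF", x_BLD, w_in_DF), 0.0)  # ReLU
y_BLD = np.einsum("BLF,FD->BLD", h_BLF, w_out_FD)
```

Reading `h_BLF` tells you its shape is `(B, L, F)` without tracing the code that produced it.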

Afroz Mohiuddin · Jun 18, 2025 · 6:44 AM UTC

Afroz Mohiuddin @afrozenator

18 Jun 2025

On sticking with your intuitions: - "Don't give up on your intuition until you figure out why it's wrong". - "... isn't going to work if you have bad intuitions, but if you have bad intuitions you're never going to do anything anyway so you might as well stick with them" 2/3

315

Afroz Mohiuddin · May 13, 2025 · 4:23 AM UTC

Afroz Mohiuddin @afrozenator

13 May 2025

“The test of a first-rate intelligence is the ability to hold two opposing ideas in mind at the same time and still retain the ability to function. One should, for example, be able to see that things are hopeless yet be determined to make them otherwise.” F. Scott Fitzgerald

1,106

Afroz Mohiuddin · Feb 5, 2023 · 10:05 PM UTC

Afroz Mohiuddin @afrozenator

5 Feb 2023

Same with TPUs and dimensions need to be well factorizable. Changing sequence length from 229 to 240 made things go 5% faster overall!

Andrej Karpathy

@karpathy

3 Feb 2023

The most dramatic optimization to nanoGPT so far (~25% speedup) is to simply increase vocab size from 50257 to 50304 (nearest multiple of 64). This calculates added useless dimensions but goes down a different kernel path with much higher occupancy. Careful with your Powers of 2.

875
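
The padding arithmetic in both posts can be sketched as a round-up-to-multiple helper. The vocab case (50257 to the nearest multiple of 64) is stated in the quoted post; treating 240 as "229 rounded up to a multiple of 16" is an assumption, since the reply only says dimensions should be well factorizable.

```python
def round_up(n: int, multiple: int) -> int:
    """Return the smallest multiple of `multiple` that is >= n."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    # Ceiling division using only integers, then scale back up.
    return -(-n // multiple) * multiple


padded_vocab = round_up(50257, 64)  # vocab size from the quoted post
padded_seq = round_up(229, 16)      # the factor 16 is an assumed choice
```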

Afroz Mohiuddin · Jun 29, 2021 · 7:13 PM UTC

Afroz Mohiuddin @afrozenator

29 Jun 2021

:-)

Sundar Pichai

@sundarpichai

29 Jun 2021

Using our Multitask Unified Model, or MUM, as introduced at #GoogleIO, we were able to identify 800+ variations of vaccine names in 50+ languages in seconds, making it possible to provide timely, high-quality information about COVID-19 vaccines worldwide. blog.google/products/search/…

Afroz Mohiuddin · Apr 30, 2023 · 8:38 PM UTC

Afroz Mohiuddin @afrozenator

30 Apr 2023

“You think because you understand 'one' you must also understand 'two', because one and one make two. But you must also understand 'and'.” — Rumi

379

Afroz Mohiuddin · Nov 25, 2020 · 1:03 AM UTC

Afroz Mohiuddin @afrozenator

25 Nov 2020

Replying to @srush_nlp

Couldn't agree more! Mesh Tensorflow does a decent job with this. In Noam's own words, once you get habituated to names, you can't go back.

Afroz Mohiuddin · Sep 17, 2022 · 4:36 AM UTC

Afroz Mohiuddin @afrozenator

17 Sep 2022

Many congratulations for the launch folks! @IrwanBello @xpearhead Noam — The VC character is on the money!

Character.AI

@character_ai

16 Sep 2022

We’re excited and proud to be opening up the Character.AI beta to the public! Character lets you create and talk to advanced AI (language tutors, text adventure games, celebrities, talking animals + more).

Afroz Mohiuddin · Sep 5, 2024 · 2:41 AM UTC

Afroz Mohiuddin @afrozenator

5 Sep 2024

“Everything in moderation, including moderation” — Oscar Wilde h/t to @vardi

438

Afroz Mohiuddin · Jan 14, 2023 · 8:00 PM UTC

Afroz Mohiuddin @afrozenator

14 Jan 2023

So surprising that I verified it tinyurl.com/asset-mgmt-irr-a… (python colab). I get: Investor $653K, Manager $1.52M & IRRs: Investor 4.69%, Manager 16.5%* Mind blown at asset manager economics! @mkt_sentiment where do I go wrong? 100K → 2.17M @ 8% for 40y. * some subtlety here

Market Sentiment Tweet - Verification

Colaboratory notebook

colab.research.google.com

Market Sentiment

@mkt_sentiment

14 Jan 2023

At age 25, you give your hedge fund manager $100K to manage, and he produces an annual return of 8%. Assuming a 1.5% management and 20% performance fee, by the time you retire at 65, you will have $764K. But the manager will have $1.24M (at zero initial investment!)

455
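
A rough sketch of the fee arithmetic under one set of assumptions, not a reproduction of the linked notebook: each year the fund earns 8%, the manager takes 20% of that year's gain, then 1.5% of the remaining assets, and reinvests all fees at the same 8%. Other fee orderings (or hurdle rates and high-water marks) would give different numbers. Under these particular assumptions the totals should land near the $653K and $1.52M figures in the reply.

```python
def simulate_fees(principal=100_000.0, annual_return=0.08, years=40,
                  mgmt_fee=0.015, perf_fee=0.20):
    """Return (investor_balance, manager_balance) after `years` years."""
    investor = principal
    manager = 0.0
    for _ in range(years):
        gain = investor * annual_return
        performance_cut = perf_fee * gain               # 20% of the year's gain
        after_perf = investor + gain - performance_cut
        management_cut = mgmt_fee * after_perf          # 1.5% of remaining assets
        # Assumption: the manager's accumulated fees also compound at 8%.
        manager = manager * (1 + annual_return) + performance_cut + management_cut
        investor = after_perf - management_cut
    return investor, manager


gross = 100_000.0 * 1.08 ** 40          # no-fee compounding, roughly 2.17M
investor_final, manager_final = simulate_fees()
```

Because the fees are assumed to compound at the same rate, the investor and manager balances always sum to the no-fee total.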

Afroz Mohiuddin · Oct 7, 2021 · 12:32 PM UTC

Afroz Mohiuddin @afrozenator

7 Oct 2021

Many a time my life has been enriched by this and in the moment I've gone from "Is that even possible!?" to "Of course! What a great idea!" Lucky to have had good advisors and hoping to pay it forward ...

David Perell

@david_perell

6 Oct 2021

My favorite thing that @TylerCowen has ever written

Afroz Mohiuddin · Mar 20, 2023 · 6:32 AM UTC

Afroz Mohiuddin @afrozenator

20 Mar 2023

Replying to @docmilanfar @saranormous

In a gold rush, it’s not always the prospector who finds gold, but the people who sell picks and shovels definitely get rich. For a while I think NVIDIA tried this, but I think with MSFT+OAI giving a nice demo for NVDA GPUs, it’s probably a case of why even bother. (1/2)

807

Afroz Mohiuddin · May 20, 2021 · 2:55 AM UTC

Afroz Mohiuddin @afrozenator

20 May 2021

"Attempts at market timing are a source of risk, not protection." — Howard Marks Also a line I'll surely use again ;-) h/t @ScarrottKalani Source: oaktreecapital.com/docs/defa…

Afroz Mohiuddin · Apr 24, 2022 · 7:45 PM UTC

Afroz Mohiuddin @afrozenator

24 Apr 2022

Replying to @salgar

I'm guessing the effects increase with age - I remember being impervious to this as a kid, now even one go at the swing makes me dizzy ...

Afroz Mohiuddin · Jun 14, 2025 · 2:58 AM UTC

Afroz Mohiuddin @afrozenator

14 Jun 2025

Replying to @_arohan_ @AIatMeta @DatHuynh13 @agarwl_

We’ll miss you and your optimism!

1,072

Afroz Mohiuddin · Jul 4, 2025 · 12:51 AM UTC

Afroz Mohiuddin @afrozenator

4 Jul 2025

Replying to @_arohan_

Truly a master :-)

250

Afroz Mohiuddin · May 31, 2023 · 3:23 AM UTC

Afroz Mohiuddin @afrozenator

31 May 2023

Replying to @borisdayma

What kind of a model is this? We usually see this in encoder-decoder models when the decoder learns to use the encoder, large gradients then start flowing to the encoder — have you logged gradient updates? #justCurious

508

Afroz Mohiuddin · May 29, 2021 · 1:52 PM UTC

Afroz Mohiuddin @afrozenator

29 May 2021

Exhibit 2: "There was a seminar for advanced students in Zürich that I was teaching and [John] von Neumann was in the class. I came to a certain theorem, and I said it is not proved and it may be difficult."

Afroz Mohiuddin · Jul 4, 2021 · 5:16 PM UTC

Afroz Mohiuddin @afrozenator

4 Jul 2021

"Without detailed understanding, confidence cannot be attained." — Richard Feynman In "Personal Observations on Reliability of Shuttle" science.ksc.nasa.gov/shuttle…

Afroz Mohiuddin · Jun 16, 2020 · 3:44 PM UTC

Afroz Mohiuddin @afrozenator

16 Jun 2020

Replying to @RishiGosalia @GosaliaRishi

The four "miracle year" papers of Einstein were published while he was a clerk at the Swiss Patent Office. One of them won him the Nobel (the photoelectric effect), not to say anything about the Special Relativity ones. en.m.wikipedia.org/wiki/Annu…

Afroz Mohiuddin · Oct 3, 2020 · 2:13 PM UTC

Afroz Mohiuddin @afrozenator

3 Oct 2020

"Tell me and I'll forget, show me and I may remember, involve me and I'll understand" -- Chinese proverb

Afroz Mohiuddin · Apr 27, 2024 · 5:11 PM UTC

Afroz Mohiuddin @afrozenator

27 Apr 2024

Friends who conducted live experiments both at Google and Apple were contrasting the stark divergences in attitude and infra for this sort of thing — both stem from what “business” the companies are in — ads and search both need awesome measurement, device/privacy doesn’t

This Post is from an account that no longer exists.

1,112

Afroz Mohiuddin · Aug 17, 2024 · 2:28 AM UTC

Afroz Mohiuddin @afrozenator

17 Aug 2024

This this and just this!

xjdr

@_xjdr

16 Aug 2024

If google ever started selling TPU hardware and released internal tooling, they'd MOG nvidia so bad. Just a trillion dollar company waiting to be built. most people don't realize how good JAX + TPUs + (other stuff) really is.

470

Afroz Mohiuddin · May 26, 2023 · 3:59 AM UTC

Afroz Mohiuddin @afrozenator

26 May 2023

@johnschulman2 predicted as much in piped.video/live/hhiLw5Q_UFg Essentially the argument being that SFT should be on what the base model knows, not on the SFT target label — factuality might go for a toss, and there might be blindspots on the creative stuff as well.

Aran Komatsuzaki

@arankomatsuzaki

26 May 2023

The False Promise of Imitating Proprietary LLMs Open-sourced LLMs are adept at mimicking ChatGPT’s style but not its factuality. There exists a substantial capabilities gap, which requires better base LM. arxiv.org/abs/2305.15717

212

Afroz Mohiuddin · May 20, 2023 · 4:45 PM UTC

Afroz Mohiuddin @afrozenator

20 May 2023

You'd think people would be passionate about languages(pytorch, tf, jax), but it's rather the config systems(gin, config_dict, fiddle, hparams!)!

278

Afroz Mohiuddin · May 15, 2021 · 8:04 PM UTC

Afroz Mohiuddin @afrozenator

15 May 2021

" ... people don't feel they need to have any particular expertise to have opinions about it. All they need is strongly held beliefs, and anyone can have those ..." @paulg was way ahead, in 2009, about why debates about Politics/Religion are uniquely unproductive. 1/2

Afroz Mohiuddin · Apr 13, 2022 · 12:18 PM UTC

Afroz Mohiuddin @afrozenator

13 Apr 2022

Replying to @_arohan_

Really really big spikes 9/10 times seem h/w issues -- We log per step -- on reruns they rarely reoccur (nothing is stochastic in the rerun).

Afroz Mohiuddin · Oct 16, 2020 · 2:36 AM UTC

Afroz Mohiuddin @afrozenator

16 Oct 2020

"It was fun-ner than I thought" - my 8 y/o after her first @TheMathCircle circle on Building Bridges. Great work by Taylor Yeracaris on making it enjoyable for the kids. :-) @avitaloliver @RishiGosalia

Afroz Mohiuddin · Apr 27, 2024 · 8:23 PM UTC

Afroz Mohiuddin @afrozenator

27 Apr 2024

Replying to @jeremyphoward @zem42 @giansegato @Replit

One more signal boost for @amasad !

745

Afroz Mohiuddin · Nov 11, 2020 · 6:28 AM UTC

Afroz Mohiuddin @afrozenator

11 Nov 2020

Experts leading experts. hbr.org/2020/11/how-apple-is… Instead of different business units having their own PnL, have different units be in charge of a business function and the whole company under one PnL

Afroz Mohiuddin · Sep 17, 2022 · 4:37 AM UTC

Afroz Mohiuddin @afrozenator

17 Sep 2022

@character_ai Congratulations!

Afroz Mohiuddin · Mar 13, 2024 · 12:54 AM UTC

Afroz Mohiuddin @afrozenator

13 Mar 2024

“Writing is nature’s way of letting you know how sloppy your thinking is” — Guindon

323

Afroz Mohiuddin · Jun 8, 2021 · 6:15 PM UTC

Afroz Mohiuddin @afrozenator

8 Jun 2021

Not only that, tools modify how you think, what you consider possible, it can be limiting if you only have a few tools - good tools and good toolmakers are to be treasured!

Afroz Mohiuddin · Feb 13, 2024 · 3:05 PM UTC

Afroz Mohiuddin @afrozenator

13 Feb 2024

I can almost hear Eric saying this in his calm demeanor :-)

Internal Tech Emails

@TechEmails

11 Feb 2024

Google engineer: AI is a serious risk to our business Dec 26, 2018

524

Afroz Mohiuddin · Jun 29, 2022 · 3:19 PM UTC

Afroz Mohiuddin @afrozenator

29 Jun 2022

One related effect of this I’ve seen is that I rule out hypotheses by how easy they are to rule out, not by how likely the hypothesis itself is.

This Post is from an account that no longer exists.

Afroz Mohiuddin · Jun 20, 2021 · 1:43 AM UTC

Afroz Mohiuddin @afrozenator

20 Jun 2021

Replying to @_arohan_

In a multihost setup we noticed that *each host was initializing its own set of parameters* and only the gradient was being averaged across hosts and each host would apply the update. We'd meant to initialize each host from the same Jax rng seed, which wasn't being set o_O
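
A toy simulation of that failure mode in plain Python (standard-library `random`, not JAX): hosts are lists of floats, the "all-reduce" is a plain mean, and the loss is a stand-in quadratic. All names and values are hypothetical and only meant to show why averaging gradients alone does not bring differently initialized hosts back in sync.

```python
import random


def init_params(seed, size=4):
    """Initialize one host's parameters from its seed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(size)]


def train(host_seeds, steps=5, lr=0.1):
    """Each host holds its own copy; only gradients are averaged."""
    params = [init_params(s) for s in host_seeds]
    for _ in range(steps):
        # Toy loss 0.5 * p**2 per parameter, so the gradient is p itself.
        grads = [list(p) for p in params]
        n_hosts = len(grads)
        mean_grad = [sum(g[i] for g in grads) / n_hosts
                     for i in range(len(grads[0]))]
        # Every host applies the same averaged update to its own copy.
        params = [[w - lr * g for w, g in zip(p, mean_grad)] for p in params]
    return params


buggy = train(host_seeds=[0, 1])    # each host seeded differently
fixed = train(host_seeds=[42, 42])  # all hosts share one seed
```

In the buggy run the gap between hosts never shrinks, because both copies receive the identical update; with a shared seed the copies stay identical throughout.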

Afroz Mohiuddin · Oct 10, 2024 · 9:00 PM UTC

Afroz Mohiuddin @afrozenator

10 Oct 2024

Congratulations to our field, for getting Nobels in their fields! Solve AGI, and use it to solve everything else. No pressure.

313

Afroz Mohiuddin · Nov 17, 2024 · 11:53 PM UTC

Afroz Mohiuddin @afrozenator

17 Nov 2024

LOL, my first instinct was “Model is diverging!”

371

Afroz Mohiuddin · May 24, 2021 · 8:04 PM UTC

Afroz Mohiuddin @afrozenator

24 May 2021

This is apt in more ways than one. In programming these days, I try to make very sure that the code is correct, by writing tests etc (slowing down), before running it. Also useful when correctness is not apparent (ex in Deep Learning code). So going slow helps you to go fast.

This Post is from an account that no longer exists.

Afroz Mohiuddin · Dec 10, 2020 · 2:35 AM UTC

Afroz Mohiuddin @afrozenator

10 Dec 2020

"... we use computer programming in a functional style to encourage clear thinking. Programming forces us to be precise and unambiguous, without forcing us to be excessively rigorous. " (1/2)

Afroz Mohiuddin · Oct 26, 2021 · 10:43 AM UTC

Afroz Mohiuddin @afrozenator

26 Oct 2021

Nice review of Expectations Investing, a very similar discussion can be found at the recent Acquired Podcast episode with @mjmauboussin podcasts.google.com/feed/aHR…

Santangel's Review @SantangelReview

25 Oct 2021

1/ Michael Mauboussin recently gave a great talk to Columbia Business School, discussing the revised and updated version of his book with Alfred Rappaport, Expectations Investing: Reading Stock Prices for Better Returns. We’ll share some of the highlights here

Afroz Mohiuddin · Jul 1, 2021 · 11:12 PM UTC

Afroz Mohiuddin @afrozenator

1 Jul 2021

"Not having experience with many fathers, I didn't realize how remarkable he was." -- Richard Feynman about his father in a characteristically tongue in cheek way. What Do *You* Care What Other People Think?

Afroz Mohiuddin · Mar 18, 2023 · 6:11 AM UTC

Afroz Mohiuddin @afrozenator

18 Mar 2023

“Laziness, impatience, and hubris” — Larry Wall, describing the three virtues of a programmer.

254

Afroz Mohiuddin · Jul 21, 2022 · 8:11 AM UTC

Afroz Mohiuddin @afrozenator

21 Jul 2022

“Do not undertake a program unless the goal is manifestly important and its achievement nearly impossible” — Edwin H Land, inventor & cofounder Polaroid

Afroz Mohiuddin · May 23, 2021 · 2:49 AM UTC

Afroz Mohiuddin @afrozenator

23 May 2021

The 2nd order effect of Charlie Munger's dictum of not forming an opinion till you can argue the opposite case better than their best person -- is that I've now no opinions on a lot of things, perhaps for the better! fs.blog/2013/04/the-work-req… 1/2

Afroz Mohiuddin · May 1, 2021 · 3:09 AM UTC

Afroz Mohiuddin @afrozenator

1 May 2021

If I may add, the only people who I've heard personally say this have been from - Ivy Leagues and IITs 🤦‍♂️

Josh @JoshuaOgundu

30 Apr 2021

Saying college is a scam to those from poor/working class backgrounds actually does more harm than you all think it does

Afroz Mohiuddin · Sep 4, 2022 · 7:56 PM UTC

Afroz Mohiuddin @afrozenator

4 Sep 2022

Replying to @_arohan_

Name?

Afroz Mohiuddin · Mar 28, 2022 · 3:45 AM UTC

Afroz Mohiuddin @afrozenator

28 Mar 2022

Replying to @_arohan_

I tried, (a few things) worked at XXL scales but at higher (confidential) scales, introduced instabilities. YMMV

Afroz Mohiuddin · Apr 5, 2025 · 8:43 PM UTC

Afroz Mohiuddin @afrozenator

5 Apr 2025

Replying to @MatthewBerman @Ahmad_Al_Dahle

Not yet

164

Afroz Mohiuddin · Nov 19, 2023 · 3:35 PM UTC

Afroz Mohiuddin @afrozenator

19 Nov 2023

Replying to @madiator

Wait for the World Cup match :/

211

Afroz Mohiuddin · Nov 23, 2024 · 2:08 PM UTC

Afroz Mohiuddin @afrozenator

23 Nov 2024

Replying to @menhguin

@astonzhangAZ

121

Afroz Mohiuddin · Oct 7, 2024 · 5:43 PM UTC

Afroz Mohiuddin @afrozenator

7 Oct 2024

"When a company slogan becomes an article of faith, it ceases to be a good slogan." -- @boztank 's law.

276

Afroz Mohiuddin · Jun 29, 2025 · 3:04 PM UTC

Afroz Mohiuddin @afrozenator

29 Jun 2025

"Raffiniert ist der Herrgott, aber boshaft ist er nicht" (God is subtle*, but malicious he is not.) — Albert Einstein * Also translated as: tricky, crafty, shrewd, sophisticated

610

Afroz Mohiuddin · Jun 19, 2022 · 6:29 PM UTC

Afroz Mohiuddin @afrozenator

19 Jun 2022

Replying to @_arohan_

Gell-Mann Amnesia effect

Afroz Mohiuddin · Apr 6, 2025 · 10:12 PM UTC

Afroz Mohiuddin @afrozenator

6 Apr 2025

Coming from you @lukaszkaiser this means a lot 🥰

Lukasz Kaiser

@lukaszkaiser

6 Apr 2025

Congratulations on Maverick, looks like a great model!!

386

Afroz Mohiuddin · Aug 11, 2020 · 4:58 AM UTC

Afroz Mohiuddin @afrozenator

11 Aug 2020

Replying to @andrew_n_carr @AlexLaterre @fchollet @SingularMattrix

APIs aren't too much different -- but you get all TF ecosystem goodies for free! (Ex: SavedModel for TFX etc) In fact, Trax (also by researchers from Google Brain) uses JAX and TensorFlow-Numpy as its backends : trax-ml.readthedocs.io/en/la… cc/ @_agarwal_ashish

Afroz Mohiuddin · Nov 8, 2020 · 3:50 PM UTC

Afroz Mohiuddin @afrozenator

8 Nov 2020

“If you put value to money with planes, apartments, and yachts, and all that kind of stuff, you’ll have a very hard time moving forward, ... If you recognize that money is just a tool, then it will be easy.”

Afroz Mohiuddin · Jan 14, 2025 · 6:21 AM UTC

Afroz Mohiuddin @afrozenator

14 Jan 2025

Replying to @_arohan_ @bhutanisanyam1 @AIatMeta

On his desk, we can raid it together :P

118

Afroz Mohiuddin · Aug 11, 2025 · 1:14 AM UTC

Afroz Mohiuddin @afrozenator

11 Aug 2025

Replying to @teortaxesTex

FWIW a router is standard practice for a bunch of production tasks in Google Search (and has been for a while), “send hard queries to bigger models, invoke more complex and expensive backends”

102
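
A toy sketch of the routing idea, purely illustrative: the difficulty heuristic, the threshold, and the backend names are all hypothetical stand-ins, and nothing here describes how any production system actually scores queries.

```python
HARD_HINTS = {"and", "versus", "compare", "why"}  # hypothetical cue words


def estimate_difficulty(query: str) -> float:
    """Hypothetical difficulty score in [0, 1]; stand-in for a learned model."""
    words = query.split()
    length_score = min(len(words) / 20.0, 1.0)
    hint_bonus = 0.3 if any(w.lower() in HARD_HINTS for w in words) else 0.0
    return min(length_score + hint_bonus, 1.0)


def route(query: str, threshold: float = 0.5) -> str:
    """Send hard queries to the expensive backend, the rest to the cheap one."""
    if estimate_difficulty(query) >= threshold:
        return "large_backend"
    return "small_backend"


easy = route("weather today")
hard = route("compare the long term tax implications of renting versus "
             "buying a home in two different states")
```

The design point is that the router itself must be much cheaper than the large backend, otherwise the routing step cancels out the savings.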

Afroz Mohiuddin · Dec 2, 2023 · 7:41 PM UTC

Afroz Mohiuddin @afrozenator

2 Dec 2023

Replying to @trengriffin

I think everyone surprised by that quote has missed the point

108

Afroz Mohiuddin · Jun 20, 2021 · 4:09 PM UTC

Afroz Mohiuddin @afrozenator

20 Jun 2021

Replying to @_arohan_ @eigenhector

Another 'variant' I've seen is a theoretically motivated approach at the beginning with a lot of mathy-ness -- followed by something else which isn't well motivated but works much better.

Afroz Mohiuddin · Aug 23, 2023 · 2:31 AM UTC

Afroz Mohiuddin @afrozenator

23 Aug 2023

“Meena eats Google” has to be one of the most prophetic documents ever written. #ifYouKnowYouKnow

496

Afroz Mohiuddin · Jun 29, 2024 · 1:45 AM UTC

Afroz Mohiuddin @afrozenator

29 Jun 2024

Constraints breed Creativity Another friend put it this way today “You are as inefficient as your profits allow you to be.” h/t @GrangierDavid

David Senra

@FoundersPodcast

27 Jun 2024

Sam Walton: Constraints are your friend

301