Tom Brown · Jul 30, 2019 · 11:45 PM UTC

Tom Brown

@NotTomBrown

30 Jul 2019

(1/4) Learning ML engineering is a long slog even for legendary hackers like @gdb. IMO, the two hardest parts of ML eng are:
1) Feedback loops are measured in minutes or days in ML (compared to seconds in normal eng)
2) Errors are often silent in ML

Greg Brockman

@gdb

30 Jul 2019

How I became a machine learning practitioner: blog.gregbrockman.com/how-i-… (Spoiler alert: you can too!)

Tom Brown · Jun 11, 2020 · 4:17 PM UTC

Tom Brown

@NotTomBrown

11 Jun 2020

Training/eval'ing GPT-3 involved a bunch of gnarly distributed system problems (which I love, but are an acquired taste tbh). The API hides those messy details so you can use normal python w/ a tight feedback loop. Gave me the same tingles as switching from TF to pytorch 😊

OpenAI

@OpenAI

11 Jun 2020

We're releasing an API for accessing new AI models developed by OpenAI. You can "program" the API in natural language with just a few examples of your task. See how companies are using the API today, or join our waitlist: beta.openai.com/

Tom Brown · Jan 25, 2017 · 3:32 AM UTC

Tom Brown

@NotTomBrown

25 Jan 2017

An illustration of @OpenAI Universe. Each dot is a task in task space. You can measure the power of an AI by the range of tasks it solves.

Tom Brown · Jul 30, 2019 · 11:45 PM UTC

Tom Brown

@NotTomBrown

30 Jul 2019

ML dev speed hack #0 - Overfit a single batch
- Before doing anything else, verify that your model can memorize the labels for a single batch and quickly bring the loss to zero
- This is fast to run, and if the model can't do this, then you know it is broken
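
A minimal, framework-free sketch of the single-batch check. It assumes a toy linear model and plain gradient descent in place of a real network and optimizer; the names `overfit_single_batch`, `batch_x`, and `batch_y` are hypothetical. With a real model you would reuse your own training step on one fixed batch and apply the same final assert.

```python
def overfit_single_batch(batch_x, batch_y, lr=0.1, steps=1000):
    """Train a toy linear model on ONE fixed batch and return the loss per step."""
    n_features = len(batch_x[0])
    w = [0.0] * n_features
    b = 0.0
    losses = []
    for _ in range(steps):
        # Forward pass on the same batch every step.
        preds = [sum(wi * xi for wi, xi in zip(w, x)) + b for x in batch_x]
        errors = [p - y for p, y in zip(preds, batch_y)]
        losses.append(sum(e * e for e in errors) / len(errors))
        # Gradient of the mean squared error with respect to w and b.
        grad_w = [
            2.0 * sum(e * x[j] for e, x in zip(errors, batch_x)) / len(errors)
            for j in range(n_features)
        ]
        grad_b = 2.0 * sum(errors) / len(errors)
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return losses


# Three examples, three free parameters: the model has enough capacity
# to memorize this batch exactly, so the loss should head to zero.
batch_x = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
batch_y = [1.0, 3.0, -2.0]
losses = overfit_single_batch(batch_x, batch_y)
assert losses[-1] < 1e-6, "model cannot memorize one batch: something is broken"
```

If the final assert fails on a batch the model should be able to memorize, the bug is likely in the model, loss, or update code, not in the data pipeline or hyperparameter choices.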

Tom Brown · Jul 18, 2020 · 11:30 PM UTC

Tom Brown

@NotTomBrown

18 Jul 2020

My personal experience with GPT-3 is similar to Max's. The model's surprisingly capable, but still has many weaknesses (which we tried our best to point out in the GPT-3 paper). I expect the future to be shiny, but getting there will need a lot of work from the whole community.

Max Woolf @minimaxir

18 Jul 2020

New blog post up: so, you've probably seen all the tweets about GPT-3. GPT-3 is objectively a step forward in the field of AI text-generation, but the current hype on VC Twitter misrepresents the model's current capabilities. GPT-3 isn't magic. minimaxir.com/2020/07/gpt3-e…

Tom Brown · Jul 30, 2019 · 11:45 PM UTC

Tom Brown

@NotTomBrown

30 Jul 2019

ML dev speed hack #2 - Assert tensor shapes
- Wrong shapes due to silent broadcasting or reduction are an extreme hot spot for silent errors; asserting on shapes (in torch or TF) makes them loud
- If you're ever tempted to write shapes in a comment, make an assert instead
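
A small sketch of how a shape assert turns a silent broadcasting bug into a loud one. It uses NumPy as a stand-in, on the assumption that its broadcasting rules behave the same way as torch's for this case; `checked_mse` is a hypothetical helper, not a library function.

```python
import numpy as np


def checked_mse(pred, target):
    """Mean squared error that fails loudly on a shape mismatch."""
    assert pred.shape == target.shape, (
        f"shape mismatch: pred {pred.shape} vs target {target.shape}"
    )
    return float(np.mean((pred - target) ** 2))


pred = np.zeros((4, 1))   # e.g. model output with a trailing unit dimension
target = np.ones(4)       # labels stored as a flat vector

# Without the assert, broadcasting silently produces a (4, 4) matrix
# and the "loss" is averaged over 16 pairs instead of 4.
silent = (pred - target) ** 2
print(silent.shape)

try:
    checked_mse(pred, target)
except AssertionError as err:
    print("loud error:", err)

# After fixing the shape, the check passes.
loss = checked_mse(pred.reshape(4), target)
```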

Tom Brown · Nov 22, 2024 · 2:50 PM UTC

Tom Brown

@NotTomBrown

22 Nov 2024

Excited to get to work with AWS and Annapurna Labs on optimizing Trainium from silicon to software. Our team’s been having fun going deep into the Neuron stack to get as close as possible to 100% peak theoretical performance.

Anthropic

@AnthropicAI

22 Nov 2024

We're expanding our collaboration with AWS. This includes a new $4 billion investment from Amazon and establishes AWS as our primary cloud and training partner. anthropic.com/news/anthropic…

Tom Brown · Feb 24, 2025 · 7:13 PM UTC

Tom Brown

@NotTomBrown

24 Feb 2025

We've long had a culture of pair-programming at Anthropic, with one engineer as the Driver and one as the Navigator. It's been interesting to watch Claude rapidly becoming proficient in the Driver role. We're hiring for great Navigators :)

Anthropic

@AnthropicAI

24 Feb 2025

Introducing Claude 3.7 Sonnet: our most intelligent model to date. It's a hybrid reasoning model, producing near-instant responses or extended, step-by-step thinking. One model, two ways to think. We’re also releasing an agentic coding tool: Claude Code.

Tom Brown · Mar 4, 2024 · 6:34 PM UTC

Tom Brown

@NotTomBrown

4 Mar 2024

I love these new models. Excited to see how the world will put them to work.

Anthropic

@AnthropicAI

4 Mar 2024

Today, we're announcing Claude 3, our next generation of AI models. The three state-of-the-art models—Claude 3 Opus, Claude 3 Sonnet, and Claude 3 Haiku—set new industry benchmarks across reasoning, math, coding, multilingual understanding, and vision.

A table of Claude 3 model family benchmarks. Claude 3 Opus, the most capable model, exceeds SOTA across reasoning, math, code, and other evaluations versus GPT-4 and Gemini Ultra.

Tom Brown · Jul 30, 2019 · 11:45 PM UTC

Tom Brown

@NotTomBrown

30 Jul 2019

(2/4) Most ML people deal with silent errors and slow feedback loops via the "ratchet" approach:
1) Start with known working model
2) Record learning curves on small task (~1min to train)
3) Make a tiny code change
4) Inspect curves
5) Run full training after ~5 tiny changes

Tom Brown · Jul 21, 2020 · 1:06 AM UTC

Tom Brown

@NotTomBrown

21 Jul 2020

This is awesome! Language models do a form of data compression, so they can help people who have limited bandwidth from their bodies due to mobility issues.

Adam V

@AdamVcoding

20 Jul 2020

Typing using only 4 keys is challenging! This is my first go at making a semantic keyboard, which works by guiding a language model to write a text for you. Using GPT-3:

Tom Brown · Jan 10, 2018 · 2:45 AM UTC

Tom Brown

@NotTomBrown

10 Jan 2018

Our work on the Adversarial Patch covered by @BBC. Glad to see mainstream media interested in ML security. Not sure what's going on with that photoshopped toast... bbc.com/news/technology-4255…

Psychedelic toasters fool image recognition tech

Colourful patterns can fool image recognition software, a team of Google researchers suggests.

bbc.com

Tom Brown · Jul 30, 2019 · 11:45 PM UTC

Tom Brown

@NotTomBrown

30 Jul 2019

ML dev speed hack #1 - PyTorch over TF
- Time to first step is faster b/c no static graph compilation
- Easier to get loud errors via assertions within the code
- Easier to drop into debugger and inspect tensors
(TF2.0 may solve some of these problems but is still raw)

Tom Brown · Apr 29, 2022 · 5:14 PM UTC

Tom Brown

@NotTomBrown

29 Apr 2022

Now seems like a good time to mention that we’re always looking for ways to more efficiently turn raw compute into useful safety research. If you know of great software engineers who are interested in building big machines then have them message me at tom@anthropic.com

Anthropic

@AnthropicAI

29 Apr 2022

We’ve raised $580 million in a Series B. This will help us further develop our research to build usable, reliable AI systems. Find out more: anthropic.com/news/announcem…

Tom Brown · May 28, 2021 · 6:22 PM UTC

Tom Brown

@NotTomBrown

28 May 2021

Excited to share what I've been working on for the last few months! If you're interested in scaling laws and safety (or scaling laws *𝗳𝗼𝗿* safety) then check out our careers page: anthropic.com/#careers

Home \ Anthropic

Anthropic is an AI safety and research company that's working to build reliable, interpretable, and steerable AI systems.

anthropic.com

Tom Brown · Sep 19, 2023 · 5:07 PM UTC

Tom Brown

@NotTomBrown

19 Sep 2023

Immensely proud of this work by the team at Anthropic. Incentives matter, and this sets up the incentive to solve safety problems so that we can scale further.

Anthropic

@AnthropicAI

19 Sep 2023

Today, we’re publishing our Responsible Scaling Policy (RSP) – a series of technical and organizational protocols to help us manage the risks of developing increasingly capable AI systems.

ASL-1: Smaller models. ASL-2: Present large models. ASL-3: Significantly higher risk. ASL-4+: Speculative. Increasing model capability, Increasing security and safety measures.

Tom Brown · Jun 20, 2024 · 10:13 PM UTC

Tom Brown

@NotTomBrown

20 Jun 2024

I like this model.

Anthropic

@AnthropicAI

20 Jun 2024

Introducing Claude 3.5 Sonnet—our most intelligent model yet. This is the first release in our 3.5 model family. Sonnet now outperforms competitor models on key evaluations, at twice the speed of Claude 3 Opus and one-fifth the cost. Try it for free: claude.ai

Benchmark table showing Claude 3.5 Sonnet outperforming (as indicated by green highlights) other AI models on graduate level reasoning, code, multilingual math, reasoning over text, and more evaluations. Models compared include Claude 3 Opus, GPT-4o, Gemini 1.5 Pro, and Llama-400b.

Tom Brown · May 29, 2020 · 2:04 AM UTC

Tom Brown

@NotTomBrown

29 May 2020

I encourage y’all to read (or at least skim) the paper. I’m really proud to have had a part in creating this work over the last 18 months and am glad to get to share it with you. Paper: arxiv.org/abs/2005.14165 Samples & Data: github.com/openai/gpt-3 (12/12)

Language Models are Few-Shot Learners

Recent work has demonstrated substantial gains on many NLP tasks and benchmarks by pre-training on a large corpus of text followed by fine-tuning on a specific task. While typically task-agnostic...

arxiv.org

Tom Brown · Feb 27, 2018 · 2:09 AM UTC

Tom Brown

@NotTomBrown

27 Feb 2018

Our new paper: "Is Generator Conditioning Causally Related to GAN Performance?" TLDR: "Almost certainly" arxiv.org/abs/1802.08768

Tom Brown · Aug 1, 2019 · 4:27 PM UTC

Tom Brown

@NotTomBrown

1 Aug 2019

Learning Day! Today I'll be learning about GPU kernel programming by going through the Numba tutorials and writing some CUDA kernels from scratch. 🍿🌽🌰 <= my kernels

Jack Clark

@jackclarkSF

1 Aug 2019

We've recently rolled out Learning Day on OpenAI's policy team and it's wonderful. Today I'll be reading a book on tech transfer initiatives between West and USSR during the 20th century. Ask me about the Gorky autoplant for a good time, comrades!

Tom Brown · May 29, 2020 · 8:50 PM UTC

Tom Brown

@NotTomBrown

29 May 2020

Wanted to give credit to @colinraffel for this excellent summary thread for T5. I really appreciate having an overview before diving into the nitty gritty of a paper, and I used this as inspiration to do my own summary thread yesterday.

This Post is from an account that no longer exists.

Tom Brown · Dec 27, 2019 · 9:44 PM UTC

Tom Brown

@NotTomBrown

27 Dec 2019

I now suspect that I have worked with several spies

Tom Brown · Jun 10, 2020 · 9:51 PM UTC

Tom Brown

@NotTomBrown

10 Jun 2020

2^42 = 4.398 Trillion
Math checks out 👌✨

Geoffrey Hinton

@geoffreyhinton

10 Jun 2020

Extrapolating the spectacular performance of GPT3 into the future suggests that the answer to life, the universe and everything is just 4.398 trillion parameters.
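
The arithmetic in the reply is easy to confirm. A quick check, reading "trillion" as 10^12:

```python
params = 2 ** 42
print(f"{params:,}")            # 4,398,046,511,104
print(f"{params / 1e12:.3f}")   # 4.398
```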

Tom Brown · Aug 24, 2018 · 10:38 PM UTC

Tom Brown

@NotTomBrown

24 Aug 2018

Everybody dance now! Great execution on this pose-transfer GAN project from Caroline Chan et al. at @berkeley_ai arxiv.org/pdf/1808.07371.pdf Their paper is quite simple and easy to follow. I'll mention three cool tricks that they used to get their results.

Tom Brown · Oct 5, 2023 · 6:53 PM UTC

Tom Brown

@NotTomBrown

5 Oct 2023

Progress on making the "inscrutable matrices" inside of Transformers more understandable! It seems like this technique is now "shovel ready" for engineers who want to work on scaling it up on frontier LLMs.

Chris Olah

@ch402

5 Oct 2023

If you'd asked me a year ago, superposition would have been by far the reason I was most worried that mechanistic interpretability would hit a dead end. I'm now very optimistic. I'd go as far as saying it's now primarily an engineering problem -- hard, but less fundamental risk.

Tom Brown · Dec 20, 2024 · 3:49 PM UTC

Tom Brown

@NotTomBrown

20 Dec 2024

Really enjoyed this convo about our journey so far and where we're headed. Especially fun reminiscing about the early days and the origin stories for each of us.

Anthropic

@AnthropicAI

20 Dec 2024

Our co-founders discuss the past, present, and future of Anthropic.
Timestamps:
00:00 Why work on AI?
02:08 Scaling breakthroughs
10:57 Sentiment shifting
18:30 The Responsible Scaling Policy
30:42 Founding story
39:08 Racing to the top
43:43 Looking to the future

Tom Brown · Jul 30, 2019 · 11:45 PM UTC

Tom Brown

@NotTomBrown

30 Jul 2019

ML dev speed hack #4 - Use ipdb.set_trace()
- It's hard to make an ML job take less than 10 seconds to start, which is too slow to maintain flow
- Using the ipdb workflow lets you zero in on a bug and play with tensors with a fast feedback loop
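
A sketch of the breakpoint workflow, assuming `ipdb` is installed (it is a third-party package) and falling back to the standard-library `pdb` otherwise. The helper `find_non_finite` and the `debug` flag are hypothetical; the point is to pause at the first sign of trouble so the state can be inspected without restarting the job.

```python
import math


def find_non_finite(values, debug=False):
    """Return indices of NaN/inf entries; optionally stop in a debugger there."""
    bad = [i for i, v in enumerate(values) if not math.isfinite(v)]
    if bad and debug:
        try:
            import ipdb as debugger  # third-party; assumed to be installed
        except ImportError:
            import pdb as debugger   # standard-library fallback
        # Execution pauses here with `values` and `bad` in scope, so you can
        # poke at the data without paying the job's startup cost again.
        debugger.set_trace()
    return bad


losses = [0.9, 0.7, float("nan"), 0.5]
print(find_non_finite(losses))  # pass debug=True to drop into the debugger
```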

Tom Brown · Jul 30, 2019 · 11:45 PM UTC

Tom Brown

@NotTomBrown

30 Jul 2019

ML dev speed hack #3 - Add ML test to CI
- If there is more than one entrypoint or more than one person working on the codebase, then add a test that runs for N steps and then checks loss
- If you only have one person and entrypoint then an ML test in CI is probably overkill
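
A sketch of what such a CI test could look like. The `train` function here is a hypothetical toy stand-in for a real entrypoint (it fits a single weight with gradient descent), and the 10x threshold is an arbitrary assumption; a real test would call the project's own training code with a small N and a threshold tuned to it.

```python
def train(num_steps, lr=0.05):
    """Toy stand-in for a training entrypoint: fit y = w * x and log the loss."""
    xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]
    w = 0.0
    losses = []
    for _ in range(num_steps):
        errors = [w * x - y for x, y in zip(xs, ys)]
        losses.append(sum(e * e for e in errors) / len(xs))
        grad = 2.0 * sum(e * x for e, x in zip(errors, xs)) / len(xs)
        w -= lr * grad
    return losses


def test_loss_drops_after_n_steps():
    """Smoke test for CI: run a few steps and require a clear loss decrease."""
    losses = train(num_steps=20)
    assert losses[-1] < 0.1 * losses[0], (
        f"loss did not drop: {losses[0]} -> {losses[-1]}"
    )


# A test runner such as pytest would collect the function by its name;
# the direct call only lets this snippet run standalone.
test_loss_drops_after_n_steps()
```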

Tom Brown · Jul 30, 2019 · 11:45 PM UTC

Tom Brown

@NotTomBrown

30 Jul 2019

ML dev speed hack #5 - Use nvvp to debug throughput
- ML throughput (step time) is one place where we have the tools to make errors loud and feedback fast
- You can use torch.cuda.nvtx.range_push to annotate the nvvp timeline to be more readable
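
A sketch of wrapping `torch.cuda.nvtx.range_push` and its counterpart `range_pop` in a context manager, so every range that is opened also gets closed. It assumes PyTorch with a usable CUDA device and degrades to a no-op otherwise; `nvtx_range` and `training_step` are hypothetical names, and the step body is a placeholder for real work.

```python
import contextlib

try:
    import torch
    # Only emit NVTX markers when a CUDA device is actually usable.
    _nvtx = torch.cuda.nvtx if torch.cuda.is_available() else None
except ImportError:
    _nvtx = None  # PyTorch not installed: ranges become no-ops


@contextlib.contextmanager
def nvtx_range(name):
    """Label a region so it shows up as a named span on the profiler timeline."""
    if _nvtx is not None:
        _nvtx.range_push(name)
    try:
        yield
    finally:
        if _nvtx is not None:
            _nvtx.range_pop()


def training_step(batch):
    # Hypothetical step: the phases below are placeholders for real work.
    with nvtx_range("forward"):
        out = [2 * x for x in batch]
    with nvtx_range("loss"):
        loss = sum(out) / len(out)
    return loss


print(training_step([1.0, 2.0, 3.0]))
```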

Tom Brown · Sep 13, 2020 · 6:37 PM UTC

Tom Brown

@NotTomBrown

13 Sep 2020

Replying to @jackclarkSF

I wonder what might be "practicing the scales" for ML research engineering?
- write minimal versions of core papers (xformer, gan, vae)
- profile them and make faster
- write core ops (like xent-loss) in pytorch and make them numerically stable
- write simple GPU ops
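
As one example of such an exercise, here is a sketch of a numerically stable cross-entropy for a single example. It uses plain Python and the log-sum-exp trick instead of PyTorch, to keep the numerics visible; `cross_entropy` and `naive_cross_entropy` are hypothetical names.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one example, via the log-sum-exp trick."""
    m = max(logits)
    # Subtracting the max keeps every exponent <= 0, so exp() cannot overflow.
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def naive_cross_entropy(logits, target):
    """Textbook form: softmax first, then -log. Overflows on large logits."""
    exps = [math.exp(z) for z in logits]
    return -math.log(exps[target] / sum(exps))


big = [1000.0, 0.0]
print(cross_entropy(big, 0))
try:
    naive_cross_entropy(big, 0)
except OverflowError as err:
    print("naive version failed:", err)
```

On moderate logits the two functions agree; they only diverge once `exp` leaves the representable range.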

Tom Brown · Sep 13, 2018 · 6:22 PM UTC

Tom Brown

@NotTomBrown

13 Sep 2018

IMO, ML researchers outside of the defense community underestimate how hard adversarial examples will be to solve. Two reasons: 1) Research runs at the speed of the conference cycle. A broken defense gets accepted to NIPS, but the community doesn't know it's broken until ICML.

Tom Brown · Feb 23, 2018 · 2:36 AM UTC

Tom Brown

@NotTomBrown

23 Feb 2018

Adversarial examples for the human visual system: An image that when viewed for a fraction of a second looks like X, but on reflection you realize it's Y.

Ian Goodfellow

@goodfellow_ian

23 Feb 2018

Adversarial examples that fool both human and computer vision arxiv.org/abs/1802.08195

Tom Brown · Jul 30, 2019 · 11:45 PM UTC

Tom Brown

@NotTomBrown

30 Jul 2019

(4/4) Within the ratchet approach, I want more tools and best practices for making feedback loops shorter and for making errors louder. Below is a short list of development speed hacks that I have found useful.

Tom Brown · Jun 13, 2017 · 4:21 PM UTC

Tom Brown

@NotTomBrown

13 Jun 2017

So many flips on my twitter feed 🙃 Proud to be involved in the safety collaboration between @OpenAI and @DeepMindAI blog.openai.com/deep-reinfor…

Tom Brown · Sep 13, 2018 · 6:22 PM UTC

Tom Brown

@NotTomBrown

13 Sep 2018

Interested in building the first ML model that can reliably distinguish between birds and bicycles? Take a look at our contest: github.com/google/unrestrict…

GitHub - coeff-giving/unrestricted-adversarial-examples: Contest Proposal and infrastructure for...

Contest Proposal and infrastructure for the Unrestricted Adversarial Examples Challenge - coeff-giving/unrestricted-adversarial-examples

github.com

Tom Brown · Apr 30, 2020 · 5:23 PM UTC

Tom Brown

@NotTomBrown

30 Apr 2020

Listening to this singing neural net over the last few months has dramatically increased my appreciation for country music 🤠

OpenAI

@OpenAI

30 Apr 2020

Introducing Jukebox, a neural net that generates music, including rudimentary singing, as raw audio in a variety of genres and artist styles. We're releasing a tool for everyone to explore the generated samples, as well as the model and code: openai.com/blog/jukebox/

Tom Brown · Nov 23, 2019 · 10:12 PM UTC

Tom Brown

@NotTomBrown

23 Nov 2019

I was lucky enough to work with Colin as I was starting out in ML research. For anyone who's interested in doing a PhD @unccs, I highly recommend talking with him. Congrats, Prof Raffel!

This Post is from an account that no longer exists.

Tom Brown · Aug 1, 2019 · 4:07 PM UTC

Tom Brown

@NotTomBrown

1 Aug 2019

Learning Day rules!! 📚🍎 People post what they're learning in Slack, and if there are others who are interested then we can learn together. We often learn faster together because each person knows a different piece of the full story.

OpenAI

@OpenAI

1 Aug 2019

Each Thursday at OpenAI is Learning Day: a day where employees have the option to self-study technical skills that will make them better at their job but which aren't being learned from daily work. Here's how it works: openai.com/blog/learning-day…

Tom Brown · Jun 14, 2018 · 12:41 AM UTC

Tom Brown

@NotTomBrown

14 Jun 2018

I know I’m like ten years late to the party, but guys, Anki is really good! Breaking down tough ideas into tiny questions means that I notice which parts I’m confused about.

Tom Brown · Aug 6, 2024 · 3:31 AM UTC

Tom Brown

@NotTomBrown

6 Aug 2024

Replying to @johnschulman2

Excited to get to work with you again, Joschu!

Tom Brown · Sep 13, 2018 · 6:22 PM UTC

Tom Brown

@NotTomBrown

13 Sep 2018

I'm still optimistic that there will be a solution to our contest. Three reasons:
1) Ensembles of humans make robust decisions on simple tasks
2) There are lots of things to try, and I've been consistently surprised by the speed of ML progress
3) This task is really really easy

Tom Brown · Oct 15, 2019 · 5:16 PM UTC

Tom Brown

@NotTomBrown

15 Oct 2019

This stuffed-giraffe-resistant hand from my roboticist friends at @openai brings joy to my heart. Problem: Deep learning has trouble generalizing outside the training set. Solution: Put WAY MORE STUFF in the training set.

OpenAI

@OpenAI

15 Oct 2019

Replying to @OpenAI

We’re all used to robots that fail when their environment changes unpredictably. Our robotic system is adaptable enough to handle unexpected situations not seen during training, such as being prodded by a stuffed giraffe:

Tom Brown · Jul 30, 2019 · 11:56 PM UTC

Tom Brown

@NotTomBrown

30 Jul 2019

cc tweeps who build stuff quickly: @karpathy, @catherineols, @jeremyphoward, @Thom_Wolf, @hardmaru, @goodfellow_ian, @soumithchintala, @D_Berthelot_ML, @josh_tobin_, @mcleavey and @AlecRad.

Tom Brown · Jul 30, 2019 · 11:45 PM UTC

Tom Brown

@NotTomBrown

30 Jul 2019

Curious what other folks recommend for speeding up ML development feedback loops and for making errors louder.

Tom Brown · Jul 20, 2020 · 10:07 PM UTC

Tom Brown

@NotTomBrown

20 Jul 2020

Replying to @gwern @dileeplearning @MelMitchell1

Cool, thanks for sharing. Makes perfect sense that BPE would mess up these tasks. Can't believe we missed running these experiments for the paper!

Tom Brown · Mar 29, 2016 · 11:32 PM UTC

Tom Brown

@NotTomBrown

29 Mar 2016

Just released a little app. meetanotherday.com - A fake meeting scheduler for people with too many real meetings

Tom Brown · May 5, 2020 · 4:49 PM UTC

Tom Brown

@NotTomBrown

5 May 2020

New study on algorithmic efficiency trends by @Hernandez_Danny
My bet is that this trend will keep up for at least three more years on ImageNet. That means that in 2023 it will take 250x (!) less compute to train to AlexNet level than it took in 2012. (1/2)

OpenAI

@OpenAI

5 May 2020

Since 2012, the amount of compute for training to AlexNet-level performance on ImageNet has been decreasing exponentially — halving every 16 months, in total a 44x improvement. By contrast, Moore's Law would only have yielded an 11x cost improvement: openai.com/blog/ai-and-effic…
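
A rough way to sanity-check the order of magnitude of the 250x bet using the figures in the quoted post. The sketch assumes the three extra years are stacked directly on top of the reported 44x with the same 16-month halving time; that baseline is an assumption, not something either tweet states. Under it the projection comes out at about 209x, and a later starting point pushes it higher, so the result stays in the low hundreds.

```python
def efficiency_gain(months, halving_months=16.0):
    """Factor by which required compute shrinks after `months` of the trend."""
    return 2.0 ** (months / halving_months)


measured_gain = 44.0          # total improvement reported in the quoted post
extra_months = 3 * 12         # "three more years", applied on top of the 44x
projected = measured_gain * efficiency_gain(extra_months)
print(round(projected))
```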

Tom Brown · Sep 13, 2018 · 6:22 PM UTC

Tom Brown

@NotTomBrown

13 Sep 2018

2) Some defenses do well against a *specific attack* (like small perturbations), but they don't generalize to other threat models

Tom Brown · Aug 3, 2017 · 4:44 PM UTC

Tom Brown

@NotTomBrown

3 Aug 2017

My favorite behavior from working on github.com/nottombrown/rl-te… - The noodle ballerina 🍝💃 Training care of noodle coach @raelifin

Tom Brown · May 17, 2018 · 5:49 PM UTC

Tom Brown

@NotTomBrown

17 May 2018

AI friends and AI skeptics of twitter - Will there be an AI project that uses 100x more compute than AlphaGo Zero within the next three years?

71% Yes (70% confidence)

13% No (70% confidence)

16% Maybe

289 votes • Final results

Tom Brown · Feb 15, 2019 · 12:08 AM UTC

Tom Brown

@NotTomBrown

15 Feb 2019

Great work from some of my colleagues at OpenAI! I’m glad that they’ve stayed true to the Charter and are being careful not to release dual-use technology without giving our institutions some time to react and adapt.

OpenAI

@OpenAI

14 Feb 2019

We've trained an unsupervised language model that can generate coherent paragraphs and perform rudimentary reading comprehension, machine translation, question answering, and summarization — all without task-specific training: blog.openai.com/better-langu…

Tom Brown · Jul 8, 2020 · 2:02 AM UTC

Tom Brown

@NotTomBrown

8 Jul 2020

Replying to @lukeprog

Have you seen this plotted as an "expected vaccine date" with error bars? If not then I could potentially scrape and re-plot it

Tom Brown · Nov 22, 2024 · 3:45 PM UTC

Tom Brown

@NotTomBrown

22 Nov 2024

And a special thanks to @diamant_ron for the time spent walking me through the system architecture, chip design, software stack, ISA, etc. You're unique in being not only an expert on chip design but also an ML practitioner who understands the option spaces that we're exploring. Really grateful to get to work with you!

Tom Brown · Oct 15, 2019 · 6:02 PM UTC

Tom Brown

@NotTomBrown

15 Oct 2019

No giraffes were in the training set for this hand. The model was just trained on TONS of random environments, so instead of memorizing solutions to specific envs, the easiest thing for the model to do was to figure out a general strategy (which included giraffe resistance!).

OpenAI

@OpenAI

15 Oct 2019

Replying to @OpenAI

Tom Brown · Jul 23, 2020 · 4:13 AM UTC

Tom Brown

@NotTomBrown

23 Jul 2020

Replying to @stewfortier

I've also been surprised by this. My theory is that it's quite good at playing the straight man, so it works as a good comic foil to outlandish prompts.

Tom Brown · May 31, 2018 · 7:21 PM UTC

Tom Brown

@NotTomBrown

31 May 2018

Allstar class of AI fellows from @open_phil with a wide range of study (provable defenses to adversaries, safe exploration in RL, language understanding, strategies of conflict, and interpretability). Looking forward to seeing how each of these students pushes forward their field

Coefficient Giving

@coeff_giving

31 May 2018

Excited to announce our first class of AI Fellows, seven machine learning students to whom we’re collectively recommending $1.1 million in PhD fellowship support over the next five years: openphilanthropy.org/focus/g…

Tom Brown · May 17, 2018 · 5:49 PM UTC

Tom Brown

@NotTomBrown

17 May 2018

I love this post because it helps operationalize one of the big disagreements between AI optimists and pessimists: Will we continue to see bigger and bigger compute projects or will the AI hype bubble collapse? blog.openai.com/ai-and-compu…

Tom Brown · Jun 11, 2020 · 7:49 PM UTC

Tom Brown

@NotTomBrown

11 Jun 2020

OpenAI has a program that is offering free access to the API to academic researchers: forms.office.com/Pages/Respo…

Tom Brown · Aug 10, 2018 · 1:53 AM UTC

Tom Brown

@NotTomBrown

10 Aug 2018

Nerding out over this colorful network architecture diagram from @OpenAI Five. s3-us-west-2.amazonaws.com/o…

Tom Brown · May 16, 2018 · 4:45 AM UTC

Tom Brown

@NotTomBrown

16 May 2018

These comics by the talented @sh_reya are 👌✨ @waitbutwhy needs to watch the stick figure throne.

Shreya Shankar

@sh_reya

11 May 2018

hello Twitter, I present a fun intro to AI safety! these comics took longer than I thought, so I'm posting half the series today & the second half on Monday. let me know what you think! or if you have any other ideas :-)

Tom Brown · Aug 13, 2020 · 3:03 PM UTC

Tom Brown

@NotTomBrown

13 Aug 2020

Replying to @Julian

"compute"

Tom Brown · Jun 13, 2018 · 6:05 AM UTC

Tom Brown

@NotTomBrown

13 Jun 2018

Awesome guide to InfoGAN by @avitaloliver - I especially like that it includes exercises for the non-lazy reader.

Depth First Learning @DepthFirstLearn

6 Jun 2018

Replying to @DepthFirstLearn

Today we released our first guide, about InfoGAN: depthfirstlearning.com/2018/… Stay tuned, there's more to come!

Tom Brown · Aug 4, 2020 · 6:25 PM UTC

Tom Brown

@NotTomBrown

4 Aug 2020

We teach our computer children to draw. We teach them by making them fight each other.

You’re unable to view this Post because this account owner limits who can view their Posts.

Tom Brown · Aug 24, 2018 · 10:38 PM UTC

Tom Brown

@NotTomBrown

24 Aug 2018

Trick #3: They add a face-specific GAN to touch up the face after the main generation is finished. They include an ablation study and it looks like it helps substantially.

Tom Brown · Oct 28, 2024 · 11:34 PM UTC

Tom Brown

@NotTomBrown

28 Oct 2024

Replying to @Mononofu @AnthropicAI @GoogleDeepMind

Really excited to work with you, Julian!

Tom Brown · Feb 24, 2025 · 7:35 PM UTC

Tom Brown

@NotTomBrown

24 Feb 2025

Navigation tips from @catherineols

Catherine Olsson

@catherineols

24 Feb 2025

Claude Code is very useful, but it can still get confused. A few quick tips from my experience coding with it at Anthropic 👉 1) Work from a clean commit so it's easy to reset all the changes. Often I want to back up and explain it from scratch a different way.

Tom Brown · Aug 3, 2017 · 4:49 PM UTC

Tom Brown

@NotTomBrown

3 Aug 2017

RL-Teacher talent-show runner-up: Noodle Dog Conductor

Tom Brown · Jul 24, 2020 · 1:58 AM UTC

Tom Brown

@NotTomBrown

24 Jul 2020

Replying to @gwern

You should add the meme.

Tom Brown · May 29, 2020 · 2:04 AM UTC

Tom Brown

@NotTomBrown

29 May 2020

Finally, we look at some ways where these models might go wrong. We look at the potential for misuse and study the biases of the model. My personal hope is that by studying these current weaknesses, we can develop solutions that will scale with more powerful systems. (9/12)

Tom Brown · Oct 5, 2019 · 6:20 AM UTC

Tom Brown

@NotTomBrown

5 Oct 2019

Replying to @AnjneyMidha @Atellani

Awesome! Best portrait scan that I've seen so far!

Tom Brown · Mar 29, 2020 · 6:18 PM UTC

Tom Brown

@NotTomBrown

29 Mar 2020

Shout out to all y’all producing OSS, research papers, blog posts and other public goods! You don’t show up in GDP, but you show up in our hearts ❤️

Michelle Valentine

@_vltn

29 Mar 2020

Realization of the day: progress in open source software does not show up in GDP figures because it isn't bought or sold. It might indirectly increase GDP because of improved productivity of companies that consume it, or related donations / managed services.

Tom Brown · Aug 24, 2018 · 10:38 PM UTC

Tom Brown

@NotTomBrown

24 Aug 2018

Trick #1: They want their poses to be aligned, so they scale and translate the source pose to match the target. (Also note that they start with a Pix2PixHD setup, so in addition to the normal GAN loss, they have an autoencoder loss in VGG feature space)

Tom Brown · Sep 6, 2020 · 12:49 AM UTC

Tom Brown

@NotTomBrown

6 Sep 2020

Replying to @christinahkim @arram @8enmann @jasoncbenn

Oh my gourd

Tom Brown · May 5, 2020 · 7:50 PM UTC

Tom Brown

@NotTomBrown

5 May 2020

Replying to @girishsastry

The 1.5x is a rule of thumb I took from @robinhanson. It might come from combining both:
1) All else equal, we should find ourselves near the middle of a trend (so it should go on for 2x)
2) We tend to only notice trends that have been going for a while (so adjust down to 1.5x)