Nando de Freitas · Sep 7, 2024 · 9:31 AM UTC

Nando de Freitas

@NandoDF

7 Sep 2024

The Llama 3 paper is a must-read for anyone in AI and CS. It’s an absolutely accurate and authoritative take on what it takes to build a leading LLM, the tech behind ChatGPT, Gemini, Copilot, and others. The AI part might seem small in comparison to the gargantuan work on *data* and *scale engineering*. I hope professors in distributed systems, high performance computing, algorithms, databases, HCI, etc use it as an example of bleeding edge CS in their classes. So many exciting open problems! @UBC_CS @CompSciOxford @berkeley_ai @Cambridge_Eng @WitsUniversity @NSERC_CRSNG @NSF @ERC_Research @UKRI_News

Soumith Chintala

@soumithchintala

23 Jul 2024

Why do 16k GPU jobs fail? The Llama3 paper has many cool details -- but notably, has a huge infrastructure section that covers how we parallelize, keep things reliable, etc. We hit an overall 90% effective-training-time. ai.meta.com/research/publica…

281

1,787

160,310

Nando de Freitas · Aug 17, 2024 · 11:54 AM UTC

Nando de Freitas

@NandoDF

17 Aug 2024

Hmmm, from what I see my colleagues in AI at Google London work bloody long hours and are extremely committed. This guy once came to London and told us to abandon Torch and use TensorFlow. That set the field of AI back by at least 6 months.

This Post is from an account that no longer exists.

2,421

415,446

Nando de Freitas · Oct 19, 2024 · 2:20 PM UTC

Nando de Freitas

@NandoDF

19 Oct 2024

I’ve walked through poor neighbourhoods in India, Africa and LatAm many times. Yet, I recently walked through one of the most depressing ones in terms of poverty, drug abuse, and sheer hopelessness: San Francisco. Giant tech AI companies promise to make the world a better place, but their backyard is in a deplorable human state. Self-driving Waymo Jaguar luxury cabs roam through a city full of homeless people on the sidewalks. If we can’t fix this, what hope do we have for the future? It’s time for ALL Big Tech to take this more seriously. Excuses like weather, etc, are rather mediocre explanations. Also, it’s beyond excuses. Maybe time for big tech and the many million dollar startups to demonstrate some responsibility, and illustrate the values, which they enforce on employees, by example. Do the right thing. Maybe a conference in SF with all big tech CEOs, government bodies, and a few people from the streets could be a good start. A commitment to solve the problem is the first step. That would restore hope. From Google images:

380

124

1,821

683,183

Nando de Freitas · Sep 9, 2019 · 8:21 AM UTC

Nando de Freitas

@NandoDF

9 Sep 2019

I believe I have written more papers than Alan Turing + John Nash! The number of papers alone is a wrong, misleading metric. Please focus instead on writing good papers that advance the field, help the world, and that you’ll be proud of when you look back in 20 or 50 years.

Stanford NLP Group

@stanfordnlp

8 Sep 2019

Yes, @GoogleAI (well, all of @AlphabetINC) produces a lot of awesome AI research, but @Stanford + @MIT together produce more (judging by @NeurIPSConf papers!), and @Stanford + @MIT + @UCBerkeley + @CarnegieMellon produces more than @AlphabetINC + @Microsoft + @facebook

359

1,774

Nando de Freitas · Sep 13, 2024 · 11:14 AM UTC

Nando de Freitas

@NandoDF

13 Sep 2024

It’s time to say thank you and goodbye to @GoogleDeepMind. I had the immense fortune of working there for 10 years. They were undoubtedly the most exciting years in the history of AI, and I feel that I grew beyond all my expectations thanks to my uniquely smart, generous and helpful colleagues. DeepMind has been the epicentre of AI in terms of innovation, but it is also the place from which notable researchers left to found @OpenAI @MistralAI @xai @udiomusic @inflectionAI and more. In fact, ex-DeepMinders are at the heart of most successful AI companies, including @AnthropicAI @cohere and more. I believe that no organisation has been more influential in technological innovation since Xerox Parc. DeepMind has truly made history and created a new future. At DeepMind I never felt alone. My first manager @demishassabis was an enormous source of inspiration, scientific freedom, and caring support. I will never forget all the support I received from him and Helen King when I was going through a very difficult personal loss ❤️ I really hope Demis, John and team get their much deserved Nobel Prize soon. They strongly deserve it. I am so proud and thankful for having been part of the ML team. You gave me so much happiness and made the dreams become reality. I learned so much from you. I also learned so much from my generous colleagues all over the organisation, from the AlphaCode team, the AlphaGo team, and so many more. Thank you 🙏 I also must thank the AVS and GenMedia teams. I treasured being able to serve you. You achieved so much in such a short time: Lyria, Imagen, Veo and more. I always looked up to most of you, and I can’t wait to see the amazing things you’ll produce over the next few months and years. You are truly exceptional, talented, hard-working and generous. I was very lucky to have been part of your teams. 
Thank you 🙏 I am very sad, and admittedly even a bit tearful as I write this, but as I posted recently “In order to grow and to improve you have to be there a bit at the edge of uncertainty.” (Mallman). It is time for me to embrace a bit of discomfort and a new episode. Love you all DeepMinders! Good luck and thank you ❤️

1,453

129,475

Nando de Freitas · Dec 19, 2020 · 10:48 PM UTC

Nando de Freitas

@NandoDF

19 Dec 2020

Can AI researchers please tweet: I am against racism, sexism, bullying and cancelling, and I believe in improving diversity, equity and inclusion in our AI community. We need to hear your voices! The students and the public need to know what most believe.

327

1,355

Nando de Freitas · Apr 20, 2025 · 4:28 PM UTC

Nando de Freitas

@NandoDF

20 Apr 2025

RL is not all you need, nor attention nor Bayesianism nor free energy minimisation, nor an age of first person experience. Such statements are propaganda. You need thousands of people working hard on data pipelines, scaling infrastructure, HPC, apps with feedback to drive benchmarks and data, tons of research and engineering on generative models, data mixtures, ablations, RL/self-training, etc etc and we will probably need lots of people working hard to figure out safety, causal world models, awareness, models that create abstractions comparable to infinity and zero and use these to predict the existence of things like black holes and suggest experiments to verify such hypotheses, or come up with novel engineering designs to generate energy more efficiently, robotics, etc etc. It takes thousands of people and many ideas. In the end some simple ideas might become obvious but such obviousness only happens in retrospect. Yes, there is a bitter lesson but if we had followed it, we’d still be doing linear regression with RL. Let’s not oversimplify, but rather honour the research and engineering of thousands of people. Also, people keep rewriting history. When our language understanding startup (darkbluelabs) was acquired by Google about 10 years ago, we joined DeepMind, where the AGI documents were all about concepts, RL, episodic memories and made it clear that there was no room for language. To be honest, back then such a position wasn’t so crazy. Now it seems silly, but only because of the benefit of hindsight. There’s no 1 or 10 heroes in the history of AI. There’s many 1000s of hard working students, profs, engineers, operations and support people, product folks, managers, even hedge funds among others. Let’s honour the whole community and not just ceos or the philosophers of Bayes, RL, deep learning, etc. I look forward to learning from the next generation and seeing what they will achieve. To them: Don’t buy the existing narratives blindly, innovate. 
Remember that just like mathematics, AI will advance one grave at a time.

193

1,395

114,333

Nando de Freitas · Nov 12, 2025 · 9:52 AM UTC

Nando de Freitas

@NandoDF

12 Nov 2025

I still remember that 2013 @NeurIPSConf party with Mark Zuckerberg. He had a bottle of water at that first Neurips corporate party. I thought it was out of character for a Neurips party - what was the matter with this kid? And why did he speak like that? We were so naive! … but it turns out they were even more naive than us. Had they been smart, they could have hired ALL of the Neurips scientists, perhaps minus a small British startup called DeepMind, which was sought after by Google, at that party. Instead, they hired only a small group - but that group included one of the greatest and most influential engineers of all time: @ylecun. Would I have said yes to a 500K offer? Hell yes!! As a (high-paid) Oxford prof I made 85K and was struggling to get a mortgage for my growing family. AI people until then didn’t do it for the money. Now we hear about people making 10 and 20 million per year - still nothing comparable to what the corporate executives make. That party changed everything. I’m happy to see Yann moving on to a new chapter. He’s so creative. I’m looking forward to seeing what he does next 🙂 When I taught my course at Oxford that year, explaining why neural networks were modular like Lego and how to use automatic differentiation (backprop) to get global consistency from local messages, it became super popular on YouTube. I thought my students and I were the first to teach this amazing generality with Torch. It turns out Yann had done it before - it just wasn’t as easy to find. Yann was not only a pioneer of convnets but also the software we all use to this very day. We owe him a lot, a lot more than most realise.

1,237

274,958
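The "modular like Lego" remark in the post above can be illustrated with a small sketch. This is not the course material or Torch's actual API; the class names (Linear, Tanh, Sequential) and the scalar setup are assumptions made for illustration. Each module only knows its own forward and backward rule (the "local message"), and chaining the backward calls in reverse order yields the gradient of the whole network, which is checked here against a finite difference.

```python
import math


class Linear:
    """Scalar affine module y = w * x + b (hypothetical, for illustration)."""

    def __init__(self, w, b):
        self.w, self.b = w, b
        self.grad_w = self.grad_b = 0.0

    def forward(self, x):
        self.x = x  # cache the input for the backward pass
        return self.w * x + self.b

    def backward(self, grad_out):
        # Local message: gradients for own parameters, then for the input.
        self.grad_w = grad_out * self.x
        self.grad_b = grad_out
        return grad_out * self.w


class Tanh:
    def forward(self, x):
        self.y = math.tanh(x)
        return self.y

    def backward(self, grad_out):
        return grad_out * (1.0 - self.y ** 2)


class Sequential:
    """Stacks modules like Lego bricks; backward runs them in reverse."""

    def __init__(self, *modules):
        self.modules = modules

    def forward(self, x):
        for m in self.modules:
            x = m.forward(x)
        return x

    def backward(self, grad_out):
        for m in reversed(self.modules):
            grad_out = m.backward(grad_out)
        return grad_out


def squared_loss(net, x, target):
    return 0.5 * (net.forward(x) - target) ** 2


first = Linear(0.5, -0.1)
net = Sequential(first, Tanh(), Linear(-1.2, 0.3))

# Analytic gradient via chained local backward passes.
y = net.forward(0.7)
net.backward(y - 1.0)  # d(loss)/dy for the squared loss
analytic = first.grad_w

# Central finite-difference estimate of the same gradient.
eps = 1e-6
first.w += eps
loss_plus = squared_loss(net, 0.7, 1.0)
first.w -= 2 * eps
loss_minus = squared_loss(net, 0.7, 1.0)
first.w += eps
numeric = (loss_plus - loss_minus) / (2 * eps)
```

No module sees the rest of the network, yet analytic and numeric agree, which is one reading of "global consistency from local messages".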

Nando de Freitas · Jul 7, 2019 · 10:09 PM UTC

Nando de Freitas

@NandoDF

7 Jul 2019

Viewpoint invariance is an important inductive bias in how we perceive objects - here tested to the limit by a smart artist.

This tweet is unavailable

313

1,070

Nando de Freitas · Mar 25, 2024 · 11:16 AM UTC

Nando de Freitas

@NandoDF

25 Mar 2024

There appears to be a mismatch between publishing criteria in AI conferences and "what actually works". It is easy to publish new mathematical constructs (e.g. new models, new layers, new modules, new losses), but as Apple's MM1 paper concludes: 1. Encoder Lesson: Image resolution has the highest impact, followed by model size and training data composition. 2. Vision-Language (VL) Connector Lesson: Number of visual tokens and image resolution matter most, while the type of VL connector has little effect. 3. Data Lesson 1: Interleaved data is instrumental for few-shot and text only performance, while captioning data lifts zero-shot performance. 4. Data Lesson 2: Text-only data helps with few-shot and text-only performance. 5. Data Lesson 3: Careful mixture of image and text data can yield optimal multimodal performance and retain strong text performance. 6. Data Lesson 4: Synthetic (caption) data helps with few-shot learning. I suspect it would be very hard to publish a paper that says "we made the images bigger and got better results". There is great value in careful ablations, scaling studies (as this paper shows when determining learning rates), data science, data engineering and engineering in general. Huge work goes into constructing the data pipelines. Engineering for a long time had a poor reputation in AI conferences - many papers were rejected with "this is just engineering!". In the end, just engineering is working beautifully. There is room for scientific invention, but just engineering also allows for an incredible amount of innovation. arxiv.org/pdf/2403.09611.pdf

188

1,005

312,746

Nando de Freitas · Apr 3, 2021 · 10:34 AM UTC

Nando de Freitas

@NandoDF

3 Apr 2021

Mike Jordan defending ML engineering! Good engineering gave us the GPU convnets, the transformers, torch, numpy, etc. The popular diminishing meme “it’s-just-engineering” is silly, and holds us back. I ❤️ creative, rigorous, robust, safe engineering. flip.it/epDjxP

Stop Calling Everything AI, Machine-Learning Pioneer Says

THE INSTITUTE Artificial-intelligence systems are nowhere near advanced enough to replace humans in many tasks involving reasoning, real-world knowledge, and social interaction. They are showing …

flip.it

145

976

Nando de Freitas · Sep 16, 2024 · 4:09 PM UTC

Nando de Freitas

@NandoDF

16 Sep 2024

I’ve joined @Microsoft AI to advance the frontier of large scale multimodal AI research and to build products for people to achieve meaningful goals and dreams. The MAI team is small, but well resourced and ambitious. We are now looking for exceptional ICs, who like to ship. If you’re interested in multimodal AI, both recognition and generation, love to collaborate and empower others, believe in diversity and inclusion, have a growth mindset, and want to impact the future of AI in a positive and profound way, please message me directly. I believe this is a rare and unique opportunity to join a new AI team that will shape the future. @black_in_ai @WiMLworkshop @_LXAI

955

108,657

Nando de Freitas · May 9, 2022 · 9:11 AM UTC

Nando de Freitas

@NandoDF

9 May 2022

Game over. Scale is essential to AI.

862

Nando de Freitas · Feb 4, 2023 · 1:05 PM UTC

Nando de Freitas

@NandoDF

4 Feb 2023

Nvidia released the Megatron language model before the pandemic. It’s amazing how influential this paper became. A must read for people wanting to learn about AI. arxiv.org/abs/1909.08053

Megatron-LM: Training Multi-Billion Parameter Language Models...

Recent work in language modeling demonstrates that training large transformer models advances the state of the art in Natural Language Processing applications. However, very large models can be...

arxiv.org

129

876

253,648

Nando de Freitas · May 14, 2022 · 8:46 AM UTC

Nando de Freitas

@NandoDF

14 May 2022

Someone’s opinion article. My opinion: It’s all about scale now! The Game is Over! It’s about making these models bigger, safer, compute efficient, faster at sampling, smarter memory, more modalities, INNOVATIVE DATA, on/offline, … 1/N thenextweb.com/news/deepmind…

DeepMind's new Gato AI makes me fear humans will never achieve AGI

DeepMind just unveiled a new AI system called Gato that makes OpenAI's GPT-3 look like a child's toy. But are we any closer to AGI?

thenextweb.com

103

207

837

Nando de Freitas · Dec 7, 2019 · 9:43 PM UTC

Nando de Freitas

@NandoDF

7 Dec 2019

Dear friends, this year I will not attend #NeurIPS2019. This is the last year before my littlest daughter starts going to school. It is important for dads to spend time with their little ones! I wish you a great conference.

842

Nando de Freitas · Apr 9, 2024 · 4:49 PM UTC

Nando de Freitas

@NandoDF

9 Apr 2024

This is by far the best non-technical Natural and Artificial Intelligence book anyone could read. This comprehensive, well-researched, crisply clear, sharply focused and illuminating book is a thing of beauty. It is the book I wish I had had when I started my AI career 30 years ago. The book tells the story of steering, emotions, reinforcement, world models, generative intelligence, counterfactual thinking, planning, awareness, theory of mind, tool use, language, GPT4 and more. It is not only a history of intelligence but also a beacon for the future of AI. Thank you @maxsbennett for this jewel. Thanks @serkancabi for sharing it. abriefhistoryofintelligence.…

126

840

188,241

Nando de Freitas · May 5, 2019 · 9:34 PM UTC

Nando de Freitas

@NandoDF

5 May 2019

Beautiful explanation of transformer neural networks jalammar.github.io/illustrat…

212

737

Nando de Freitas · Oct 7, 2025 · 12:03 PM UTC

Nando de Freitas

@NandoDF

7 Oct 2025

I would like to hire exceptional engineers (code, math, science, games, video). Essential: people who want to transform the world in positive ways by advancing AI. Email me at: JoinAITeam@microsoft.com Preference for generalists who can work with data, model ablations, inference and evals. Exceptional coding skills a must.

761

92,791

Nando de Freitas · Sep 21, 2019 · 9:57 AM UTC

Nando de Freitas

@NandoDF

21 Sep 2019

Last decade in AI was about solving big associative learning: Language models, image labelling, speech recognition, lipreading, Starcraft, etc. A triumph! The recipe was always the same: 1. Big net 2. Massive curated dataset 3. Many iterations. Building the dataset is underrated.

160

715

Nando de Freitas · Nov 22, 2024 · 12:59 PM UTC

Nando de Freitas

@NandoDF

22 Nov 2024

The OpenAI letters: lesswrong.com/posts/5jjk4CDn… Some of what is said here is absolutely shocking. The politics, hysteria, incompetence, power hunger, gaslighting, etc are beyond any HBO show. I was a leading researcher at DeepMind at the time reporting to Demis. Most of what is said about DeepMind in these letters is absolute rubbish. We were simply scientists, figuring out intelligence, trying to figure out how to do good things with it. Anyone else could do the same and we loved competition. We opened our labs to all these people, especially Elon, and they abused our openness. I made that clear to Greg Brockman at ICLR where I also tried to patch up the newly created animosity among researchers, among colleagues, among friends. This is all the more reason for open AI. The people must decide, and not a bunch of billionaires playing with scientists. Ilya got it prophetically right: we did all the training, ideas and code, but now our only power left is to protest in apps owned by the same billionaires. But protest we will. AI must be for the people, for all nation states.

OpenAI Email Archives (from Musk v. Altman and OpenAI blog) — LessWrong

As part of the court case between Elon Musk and Sam Altman, a substantial number of emails between Elon, Sam Altman, Ilya Sutskever, and Greg Brockma…

lesswrong.com

722

88,160

Nando de Freitas · Jun 18, 2020 · 8:41 AM UTC

Nando de Freitas

@NandoDF

18 Jun 2020

This is such an important tweet for new researchers: A Turing award winner’s public admission that failure is ok. It’s through trying and failing (ie falsifying some hypotheses) that we make scientific progress. Thanks Geoff for setting a brilliant example.

Geoffrey Hinton

@geoffreyhinton

17 Jun 2020

I thought I had a very good idea about perceptual learning and accepted several invitations to give talks about it next week. But I have just discovered a fatal flaw in the idea, so I am cancelling all those talks. I apologize.

115

716

Nando de Freitas · Dec 31, 2018 · 7:34 AM UTC

Nando de Freitas

@NandoDF

31 Dec 2018

For anyone wanting to make their lectures freely available to everyone. It’s highly rewarding when people approach me at conferences and tell me they got into machine learning by watching my UBC and Oxford lectures. Every course should be online; translated to all languages.

Alan Mackworth @AlanMackworth

30 Dec 2018

Replying to @AlanMackworth @NandoDF @egrefen @zacharylipton @UBC_CS

So we use an NCast Telepresenter to record lectures: ncast.com/ncastproductspr.ht…

100

707

Nando de Freitas · Sep 21, 2025 · 3:08 PM UTC

Nando de Freitas

@NandoDF

21 Sep 2025

Most RL for LLMs involves only 1 step of RL. It’s a contextual bandit problem and there’s no covariate shift because the state (question, instruction) is given. This has many implications, eg DAgger becomes SFT, and it is trivial to design Expectation Maximisation (EM) maximum likelihood solutions that do exactly the same as RL. Of course, RL and multiagent systems will be needed as the picture illustrates.

703

96,479
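One way to read the one-step claim in the post above, sketched under stated assumptions: for a softmax policy over a handful of candidate responses to a fixed prompt, the exact policy gradient of the expected reward equals the gradient of a reward-weighted log-likelihood whose weights are computed from the current policy and then held fixed, which is the kind of objective an EM-style M-step would maximise. The function names, the four-action setup and the reward values are all hypothetical; this is not the author's derivation.

```python
import math


def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_gradient(logits, rewards):
    """Exact gradient of J = sum_a pi(a) * r(a) with respect to the logits."""
    pi = softmax(logits)
    expected_reward = sum(p * r for p, r in zip(pi, rewards))
    return [p * (r - expected_reward) for p, r in zip(pi, rewards)]


def weighted_mle_gradient(logits, weights):
    """Gradient of sum_a weights[a] * log pi(a), with the weights held fixed."""
    pi = softmax(logits)
    total_weight = sum(weights)
    return [w - total_weight * p for w, p in zip(weights, pi)]


# Hypothetical one-step problem: one prompt, four candidate responses.
logits = [0.2, -1.0, 0.5, 0.0]
rewards = [1.0, 0.0, 0.3, 0.7]  # non-negative rewards assumed

# "E-step": weight each response by current probability times reward.
pi_old = softmax(logits)
weights = [p * r for p, r in zip(pi_old, rewards)]

# "M-step" gradient versus the one-step RL gradient at the same parameters.
g_rl = policy_gradient(logits, rewards)
g_mle = weighted_mle_gradient(logits, weights)
max_gap = max(abs(a - b) for a, b in zip(g_rl, g_mle))
```

The two gradients coincide at the current parameters because there is only one decision per episode; with multi-step interaction the state distribution would depend on the policy and this shortcut would no longer apply.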

Nando de Freitas · Jun 28, 2020 · 10:49 AM UTC

Nando de Freitas

@NandoDF

28 Jun 2020

Our field lacks diversity. This is the biggest danger of AI. As we witnessed this week, it is not easy to tear the chains of history. Few of us are able to rise above our environments and see our biases. Fortunately colleagues like @timnitGebru have bravely helped us 1/

152

626

Nando de Freitas · Sep 27, 2025 · 9:06 AM UTC

Nando de Freitas

@NandoDF

27 Sep 2025

The only bitter lesson is that LLMs have succeeded beyond any expert expectations. Underpinning LLMs is the idea of scaling, which is too often misunderstood as more parameters. Scaling is about using massive compute effectively to maximise the throughput of data ingestion into the learning process to obtain more capable models. We are still far from hitting the limits in this. We are still compute hungry because there is a ton more we could achieve if only we had more compute, from experimental ablations to data acquisition and curation. Scaling is largely about data and evals. The models are now trained on almost all the web and equally large (but growing) self generated synthetic data. Sifting through such vast quantities of data (the whole of the human creation) requires formidable engineering and intelligent ideas. This is what differentiates most models. AI is finally in the hands of billions of users, and with it come billions of tasks - every reasonable user need. This scaling in tasks and evaluations is many orders of magnitude larger than pre-LLMs. Having the right architecture matters, but we know several alternatives could all work well, eg replacing attention in Transformers with RNNs and interleaving such layers with local layers. What matters is fine ablations to maximise hardware usage. This is the realm of sophisticated high-precision engineering. It encompasses semiconductor design, datacenter design, distributed systems, MFU, etc. There is fascinating work on flow matching, JEPA, sparser MoEs, etc, that is all consistent with scaling. I’m terrible at predictions, but in this we have stayed the course. There’s been pleasant surprises like the effectiveness of reasoning, which while allowing for fewer parameters, still demands even more compute. Sparser multimodal MoEs also will allow for better continual learning. This is an old idea, eg arxiv.org/pdf/1108.3298, which is finally being done at scale. 
Successful scaling is mostly about organising people into effective teams for research, development and production. They have to be teams of happy and ambitious people who put the team first. Yes, tech VCs and CEOs: work life balance matters to achieve prolonged success, something I think @demishassabis did really well at @GoogleDeepMind and which I promote at @MicrosoftAI. Bitter lesson: it really is all about scaling and hard work by thousands of amazing people. Hardly bitter, but hopeful and inspiring.
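The post mentions MFU (model FLOPs utilisation) as one of the quantities that hardware-focused ablations try to maximise. A minimal sketch of how it is commonly estimated follows, assuming the widely used rule of thumb of about 6 FLOPs per parameter per training token for a dense transformer. The function names and every number in the example are hypothetical and do not come from the post.

```python
def training_flops_per_token(n_params):
    # Rule-of-thumb assumption for dense transformers:
    # about 6 FLOPs per parameter per token (forward plus backward).
    return 6.0 * n_params


def mfu(n_params, tokens_per_second, n_accelerators, peak_flops_per_accelerator):
    """Model FLOPs utilisation: achieved model FLOP/s over theoretical peak FLOP/s."""
    achieved = training_flops_per_token(n_params) * tokens_per_second
    peak = n_accelerators * peak_flops_per_accelerator
    return achieved / peak


# Hypothetical run: 7B parameters, 1M tokens/s across 128 accelerators,
# each with an assumed peak of 1e15 FLOP/s.
example_mfu = mfu(
    n_params=7e9,
    tokens_per_second=1.0e6,
    n_accelerators=128,
    peak_flops_per_accelerator=1.0e15,
)
```

With these made-up figures the run reaches roughly a third of the theoretical peak; the gap between that and 100% is what the systems work described in the post is trying to close.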

Richard Sutton

@RichardSSutton

26 Sep 2025

Replying to @GaryMarcus @ylecun @demishassabis

You were never alone, Gary, though you were the first to bite the bullet, to fight the good fight, and to make the argument well, again and again, for the limitations of LLMs. I salute you for this good service!

682

195,515

Nando de Freitas · Oct 30, 2025 · 10:10 AM UTC

Nando de Freitas

@NandoDF

30 Oct 2025

We are hiring star research and data engineers to invent the future of AI. JoinAITeam@microsoft.com If you’re finishing your undergrad or PhD at Imperial, Cambridge, Oxford, UCL, Toronto, MIT, MILA, UBC, ETH, Stanford, Caltech, UCLA, Berkeley, CMU, UW, NYU, Princeton, Columbia, Harvard, Yale or any other top school in STEM, please apply too. I love working with energetic people, who are prepared to work on what is needed to shape AI, make it safe, make it brilliant, make it creative, and make it useful in math, science, healthcare, education, energy and environment.

Brad Gerstner

@altcap

29 Oct 2025

Look forward to having @satyanadella & @sama on @BG2Pod tomorrow. The deal. The skeptics. The re-industrialization of America. Power. Chips. Models. Agents. AGI. Regulation. Jobs. And more… 🧐🚀🇺🇸

674

222,989

Nando de Freitas · Sep 1, 2024 · 7:47 AM UTC

Nando de Freitas

@NandoDF

1 Sep 2024

Some companies can turn the most motivated scientists and engineers into unproductive, complacent hamsters. They do this by introducing a large number of levels and process-heavy performance reviews several times a year. People become obsessed with their level, obsessed with comparing against the level of their peers, they choose not to solve hard problems because they would rather do something easy and get promoted to the next level. Managers benefiting from this pretend these levels are real. Managers who’d rather focus on products and engineering have to stop working for a few weeks to write performance reviews, mostly with LLMs these days. The whole thing is toxic and an aberration of real feedback, learning and motivation.

Sean Kelly @seanpk

30 Aug 2024

The way that Jensen Huang runs Nvidia is wild: 40 direct reports. No 1:1s. No formal planning cycles. And no status reports. In a recent interview, he went in-depth on his Leadership style. Every entrepreneur must understand why it works:

611

244,408

Nando de Freitas · Dec 15, 2024 · 10:12 PM UTC

Nando de Freitas

@NandoDF

15 Dec 2024

Let us please talk more about mental health in the AI community. I was shocked and reminded of this by the sad and tragic death of this young colleague with so much talent. Many of the people in our community are likely on the spectrum; ADHD, autism, Asperger’s and so on. This rich neural diversity is likely responsible for great progress in AI, but these people, including myself, are also very vulnerable. AI used to be a small community, where universities provided shelter. But, now, the stakes are very high. There is huge competition among AI corporations in the AI race leading to routine mergers and reorgs, which cause great uncertainty and disruption. Stressed executives apply pressure and pass the stress down to the ICs. Researchers no longer enjoy the freedom to publish at most corporations, which is a huge change to what they did before. I’m not judging whether this is good or bad, just that it is a huge change. Researchers get paid a lot. They are sometimes told by managers to go on the job market to ascertain their value before applying for promotion, they are forced to sign 6-month to 1-year non-competes and notice periods to be able to accept a deserved promotion. This appears to me as a modern form of feudalism. Put simply, people are treated just as any other resource, as stuff. The financial stakes are very high for everyone. Many AI scientists are now media stars. They enjoy huge media exposure, and thousands of followers in social media, but many crave more fame. The potential for huge negative or positive impact also raises the stakes. AI ICs often see themselves as game pieces in a game among nations and corporations that is fraught with uncertainty and power trips. It is hard to tell right from wrong because laws often lag behind. Working in AI is a privilege. I repeat, it is a huge privilege. Yet, when people suffer depression, endure micro-aggressions, or get suicidal thoughts, it doesn’t feel like any of the privileges matter. 
If you’re feeling any of this, please find a therapist. It may take a few tries, and it may take time. It is worth it. Take it from one of your colleagues who has benefited a lot from PTSD therapy. I believe it has made me a more productive researcher, a more effective collaborator, and someone who appreciates work-life balance and differences in working styles. Please use the help that exists proudly, you’re not alone - you are special and you are loved. Rest in peace, Suchir Balaji. Thank you for everything you gave us ❤️

BNO News

@BNONews

14 Dec 2024

OpenAI whistleblower Suchir Balaji, who accused the company of breaking copyright law, found dead in apparent suicide

650

189,880

Nando de Freitas · Mar 29, 2024 · 3:40 PM UTC

Nando de Freitas

@NandoDF

29 Mar 2024

Predicting the next word "only" is sufficient for language models to learn a large body of knowledge that enables them to code, answer questions, understand many topics, chat, and so on. This is clear to many researchers now, and there are nice tutorials on why this works by @ilyasut resorting to compression ( piped.video/watch?v=AKMuA_TV… ) and by @geoffreyhinton ( piped.video/watch?v=iHCeAotH… ). However, the emergence of types of understanding is not unique to language models. In arxiv.org/pdf/1804.06318.pdf by @notmisha and @brandondamos the authors trained models to predict the next few time steps of over a hundred robot hand sensors (Touch, Gyro, Accelerometer, Joint Info, Actuator Info, etc.). They then found out that they could regress the shape of the thing the hand was touching from the activations of the neural networks using probes. That is, the model developed an internal representation of shapes even though it was simply used to predict "only" the next few senses. Awareness follows from simple predictions and interaction with the world.

124

641

133,084

Nando de Freitas · Jun 6, 2018 · 10:46 PM UTC

Nando de Freitas

@NandoDF

6 Jun 2018

Building a Deep Neural Network to play FIFA 18 codementor.io/deepgamingai/b…

182

578

Nando de Freitas · Feb 24, 2022 · 8:03 PM UTC

Nando de Freitas

@NandoDF

24 Feb 2022

What role can the AI community play in a world where bullies attack peaceful democratic countries 🇺🇦 and threaten the world? I’m really curious to hear from everyone.

125

540

Nando de Freitas · Oct 8, 2024 · 11:32 PM UTC

Nando de Freitas

@NandoDF

8 Oct 2024

For the first time in my life I can explain what the physics @NobelPrize is about! In fact, if you’d like to learn what a Hopfield net is and how it relates to NP hard satisfiability, Boltzmann machines, autoencoders, score matching, Maxwell demons, maximum likelihood, generative AI, quantum computing, unsupervised learning and neural networks, see these slides and video lectures from a course I taught at @ipam_ucla 2012 helper.ipam.ucla.edu/publica… piped.video/watch?v=XYEs7k… piped.video/watch?v=JlONAaoW… piped.video/watch?v=t9sXdA…

577

96,794

Nando de Freitas · Oct 3, 2019 · 7:56 PM UTC

Nando de Freitas

@NandoDF

3 Oct 2019

27,600 GPUs, 1/2 PB data, and a neural net with 220,000,000 weights. More please! arxiv.org/pdf/1909.11150.pdf

102

543

Nando de Freitas · Sep 3, 2024 · 7:05 PM UTC

Nando de Freitas

@NandoDF

3 Sep 2024

It is remarkable that anyone can now train a 124M parameter LLM in about real-time on a MacBook M3. So easy to experiment. This would have been the stuff of dreams when I was in school. I ❤️ training neural nets, but I really admire the people who build the hardware.

526

46,816
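For a sense of where a figure like 124M comes from, here is a worked count, assuming the post refers to a GPT-2-small-shaped model (12 layers, width 768, vocabulary 50257, context 1024, output head tied to the token embedding). That architecture is an assumption; the post does not name the model, and the function below is a hypothetical helper rather than any library's API.

```python
def gpt2_style_param_count(n_layer, d_model, vocab_size, n_ctx):
    """Count weights and biases of a GPT-2-style decoder with a tied output head."""
    # Token embeddings plus learned position embeddings.
    embeddings = vocab_size * d_model + n_ctx * d_model

    # Each layer norm has a scale and a shift vector.
    layer_norm = 2 * d_model

    # Attention: fused QKV projection plus output projection, with biases.
    attention = (d_model * 3 * d_model + 3 * d_model) + (d_model * d_model + d_model)

    # MLP: expand to 4 * d_model, then project back, with biases.
    mlp = (d_model * 4 * d_model + 4 * d_model) + (4 * d_model * d_model + d_model)

    per_layer = 2 * layer_norm + attention + mlp

    # Final layer norm; the output head reuses the token embedding matrix.
    return embeddings + n_layer * per_layer + layer_norm


total_params = gpt2_style_param_count(
    n_layer=12, d_model=768, vocab_size=50257, n_ctx=1024
)
millions = total_params / 1e6
```

Under these assumptions the count lands at about 124 million, with roughly 85 million in the transformer layers and most of the rest in the token embedding table.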

Nando de Freitas · Mar 10, 2019 · 10:49 PM UTC

Nando de Freitas

@NandoDF

10 Mar 2019

Yann @ylecun is a visionary who advocated for SGD, GANs, convnets, contrastive losses, deep nonlinear models, autodiff modular software, etc when most thought he was joking. As a scientist, don’t just follow the crowd, but innovate, think, test. amp.timeinc.net/fortune/2019…

499

Nando de Freitas · Feb 9, 2025 · 3:51 PM UTC

Nando de Freitas

@NandoDF

9 Feb 2025

Replying to @airkatakana

I think you’re missing a lot of history. @ylecun also championed online learning (SGD) - yes it sounds crazy but there was a time when this wasn’t the majority view. Yann also spearheaded the modular approach to NN training, which led to torch, PyTorch, etc. He championed energy based methods and Siamese nets. He was one of the first few to push for training nets with GPUs. So it wasn’t just back prop for convnets, but numerous contributions over the years as well as mentoring many impactful researchers. Yann is simply one of the greatest engineers of our time, and when he speaks, I suggest you listen… or live to regret it as I have in the past.

503

158,564

Nando de Freitas · Dec 18, 2023 · 1:20 PM UTC

Nando de Freitas

@NandoDF

18 Dec 2023

I’m with @ylecun and @AndrewYNg on this. I feel it is more responsible to devote greater effort to solving today’s problems (e.g. climate, health, energy, poverty, safety, communication, bias and discrimination, education) than to cultist AI long term speculation.

Yann LeCun

@ylecun

18 Dec 2023

Exactly.

479

148,364

Nando de Freitas · Dec 14, 2018 · 12:14 AM UTC

Nando de Freitas

@NandoDF

14 Dec 2018

“I think the brain isn’t concerned with squeezing a lot of knowledge into a few connections, it’s concerned with extracting knowledge quickly using lots of connections.” Geoff Hinton. wired.com/story/googles-ai-g…

Google’s AI Guru Wants Computers to Think More Like Brains

Google's top AI researcher, Geoff Hinton, discusses a controversial Pentagon contract, a shortage of radical ideas, and fears of an "AI winter."

wired.com

134

469

Nando de Freitas · Oct 26, 2019 · 2:29 PM UTC

Nando de Freitas

@NandoDF

26 Oct 2019

A neural net solves the three-body problem 100 million times faster flip.it/.xDT9n

A neural net solves the three-body problem 100 million times faster

Machine learning provides an entirely new way to tackle one of the classic problems of applied mathematics. In the 18th century, the great scientific challenge of the age was to find a way for …

flip.it

130

485

Nando de Freitas · Feb 21, 2025 · 2:00 PM UTC

Nando de Freitas

@NandoDF

21 Feb 2025

Feeling pressure to work more than 5 days per week? Don’t. I’ve been super productive all my life and I never worked on the weekend unless I was excited about something. I don’t regret a single day of vacation. Quite the opposite. Life is too short. Don’t let CEO talk and corporate bullshit ruin your life and that of your loved ones. I make sure to work with bosses that care about what I deliver and not how hard I should be working. Between 17 and 21 I used to work 14 hours per day, 7 days per week. I was underpaid and got very little out of it. Working smarter has been more effective. Weekends are awesome 🤩

486

30,743

Nando de Freitas · Nov 12, 2019 · 5:35 PM UTC

Nando de Freitas

@NandoDF

12 Nov 2019

The poster has no affiliation because in Guatemala AI research is what committed researchers do at night on their own, after a day of work. Touching and inspiring @Khipu_AI

473

Nando de Freitas · Jun 16, 2018 · 8:14 AM UTC

Nando de Freitas

@NandoDF

16 Jun 2018

Solving the Rubik's Cube Without Human Data technologyreview.com/s/61128…

A machine has figured out Rubik’s Cube all by itself

Unlike chess moves, changes to a Rubik’s Cube are hard to evaluate, which is why deep-learning machines haven’t been able to solve the puzzle on their own. Until now.

technologyreview.com

172

465

Nando de Freitas · Feb 9, 2019 · 9:58 AM UTC

Nando de Freitas

@NandoDF

9 Feb 2019

Ptolemy was a genius, but alas! a simpler model came along. Keep it simple and ensure it makes good testable predictions. medium.com/tensorflow/mit-de…

MIT Deep Learning Basics: Introduction and Overview with TensorFlow

As part of the MIT Deep Learning series of lectures and GitHub tutorials, we are covering the basics of using neural networks to solve…

medium.com

117

467

Nando de Freitas · Mar 26, 2025 · 8:50 AM UTC

Nando de Freitas

@NandoDF

26 Mar 2025

Dear @GoogDeepMind ers, First, congrats on the new impressive models. Every week one of you reaches out to me in despair to ask me how to escape your notice periods and noncompetes. Also asking me for a job because your manager has explained this is the way to get promoted, but I digress. Please don’t reach out to me. Rather reach out to each other. Your leads are responsible for this. Talk to them. @koraykv and @douglas_eck have both said they’re against it, so maybe start there. Above all don’t sign these contracts. No American corporation should have that much power, especially in Europe. It’s abuse of power, which does not justify any end.

458

86,666

Nando de Freitas · Aug 7, 2025 · 9:18 AM UTC

Nando de Freitas

@NandoDF

7 Aug 2025

Work life balance is a top priority. Yes, there was nearly a decade in my life when I worked 98 hours a week, but I did it out of need and aspiration, not because anyone forced me. During my Cambridge PhD I published more than anyone around me, but I never worked a single weekend. I focused, worked smart, and delivered. I’ve continued delivering at Berkeley, UBC, Cifar, Oxford, DeepMind, MAI, but I only work hard when I want to achieve something important. I love holidays and my social life. Only my wife is allowed to ask me to work harder 🥰 It’s about results, focus, attention to detail, vision, dreams, collaboration, and not bloody hours. Hours management is pathetic and easy to game and a guaranteed path to mediocrity. If you like my philosophy, we’re hiring engineers of all kinds - message linkedin.com/company/microso… The ‘9-9-6 Work Schedule’ Could Be Coming To Your Workplace Soon via @forbes forbes.com/sites/bryanrobins…

Microsoft AI | LinkedIn

Microsoft AI | 86,581 followers on LinkedIn. At MAI we’re building a new class of safer, more capable AI systems we call Humanist Superintelligence: AI that is always aligned, controllable, and in...

linkedin.com

460

49,658

Nando de Freitas · Dec 21, 2018 · 9:28 AM UTC

Nando de Freitas

@NandoDF

21 Dec 2018

Our meta-learning approach is state-of-the-art in Text-2-Speech, and does it in 5mins instead of 4 hours. This shows that neural nets can work with few data, when we embrace many tasks, and work better! Now a poster 😊 openreview.net/forum?id=rkzj…

120

432

Nando de Freitas · Sep 13, 2019 · 6:15 AM UTC

Nando de Freitas

@NandoDF

13 Sep 2019

I recommend this paper with theoretical and algorithmic insights on metalearning to researchers interested in hierarchical Bayes, MAML, and Reptile. It addresses the idea of learning reusable fixed and adaptive modules across many tasks. arxiv.org/abs/1909.05557

Modular Meta-Learning with Shrinkage

Many real-world problems, including multi-speaker text-to-speech synthesis, can greatly benefit from the ability to meta-learn large models with only a few task-specific components. Updating only...

arxiv.org

106

434

Nando de Freitas · Jul 1, 2019 · 10:47 PM UTC

Nando de Freitas

@NandoDF

1 Jul 2019

Machine learning to automatically translate long-lost languages technologyreview.com/s/61389…

Machine learning has been used to automatically translate long-lost languages

Some languages that have never been deciphered could be the next ones to get the machine translation treatment.

technologyreview.com

154

423

Nando de Freitas · Sep 14, 2017 · 10:29 AM UTC

Nando de Freitas

@NandoDF

14 Sep 2017

Deep learning and TF, without a Ph.D. by Martin Görner. Brilliant teaching resource @DeepIndaba cloud.google.com/blog/big-da…

145

436

Nando de Freitas · Sep 14, 2025 · 11:10 AM UTC

Nando de Freitas

@NandoDF

14 Sep 2025

If you have 10K data instances, would you: 1. SFT an LLM with 10K data, or 2. Learn a reward with 5K, and RL the LLM on the remaining 5K with the learned reward 3. Other (explain)?

446

177,380

Nando de Freitas · Jan 28, 2017 · 8:29 AM UTC

Nando de Freitas

@NandoDF

28 Jan 2017

I'm agnostic and not a fan of religion, but I refuse to visit a country that introduces bans on Muslims and refugees. #nips2016 #nips2venues

116

412

Nando de Freitas · Mar 14, 2018 · 9:01 AM UTC

Nando de Freitas

@NandoDF

14 Mar 2018

At 17 from Caracas to Joburg, I stopped in Rio for 3 days. I bought a book. It changed me. It was my company in dark hardworking days selling beer for a living. At 21 I went to university dreaming of understanding the universe. Thanks for writing the book Stephen Hawking. RIP

434

Nando de Freitas · Feb 10, 2018 · 5:27 PM UTC

Nando de Freitas

@NandoDF

10 Feb 2018

Why do we label emotions as positive and negative? It’s unsatisfactory that such a key component of cognition is modelled with such a crass binary classifier. In AI we need to get more serious about this.

110

422

Nando de Freitas · Jul 6, 2021 · 11:02 PM UTC

Nando de Freitas

@NandoDF

6 Jul 2021

The future of deep learning, according to its pioneers – Wise insights. bdtechtalks.com/2021/07/01/d…

The future of deep learning, according to its pioneers - TechTalks

Deep learning pioneers Yoshua Bengio, Geoffrey Hinton, and Yann LeCun discuss current limits of deep learning and future directions of research.

bdtechtalks.com

410

Nando de Freitas · Mar 2, 2019 · 8:34 AM UTC

Nando de Freitas

@NandoDF

2 Mar 2019

Where can we find other cool examples of continuous systems like this one being used to output discrete symbols? Could be potentially nice for neural networks. Thanks!

Fermat's Library

@fermatslibrary

1 Mar 2019

Geneva Drive: A mechanism that converts continuous motion into discrete motion. The name was derived from its usage in mechanical watches which were popularized in Geneva. This mechanism can also be found in movie projectors, banknote counting machines...

408

Nando de Freitas · Dec 7, 2020 · 7:02 PM UTC

Nando de Freitas

@NandoDF

7 Dec 2020

This will be a long thread. It represents my views solely. Many are puzzled by why I feel it possible to support both @JeffDean and @timnitGebru so I’d like to explain. I will start by saying that this in no way denies any current or past injustices. 1/n

425

Nando de Freitas · Aug 24, 2019 · 10:13 AM UTC

Nando de Freitas

@NandoDF

24 Aug 2019

I agree. This was a phenomenal paper. I’m hoping it will inspire researchers to probe further.

Ian Osband

@IanOsband

23 Aug 2019

Looking back over the year, the one paper that gave me the best "aha" moment was... Reconciling Modern Machine Learning and the Bias-Variance Tradeoff: arxiv.org/abs/1812.11118 The "bias-variance" you knew was just the first piece of the story!

418

Nando de Freitas · Oct 23, 2019 · 8:22 AM UTC

Nando de Freitas

@NandoDF

23 Oct 2019

The @OpenAI paper is excellent. It is an achievement and it highlights the biggest AGI challenges (perception, long-horizon, exploration, motor control, compositionality, meta and continual learning, embodiment). What humans find obvious are the hardest things to solve in AGI 1/2

OpenAI

@OpenAI

15 Oct 2019

We've trained an AI system to solve the Rubik's Cube with a human-like robot hand. This is an unprecedented level of dexterity for a robot, and is hard even for humans to do. The system trains in an imperfect simulation and quickly adapts to reality: openai.com/blog/solving-rubi…

402

Nando de Freitas · Sep 30, 2018 · 12:10 PM UTC

Nando de Freitas

@NandoDF

30 Sep 2018

ICLR 2019 lessons thus far: The deep neural nets have to be BIGGER and they’re hungry for data, memory and compute. GANs, Res-blocks, LSTMs, convnets, & multiagent tricks are doing the job.

111

410

Nando de Freitas · Dec 20, 2018 · 9:58 AM UTC

Nando de Freitas

@NandoDF

20 Dec 2018

My slides for the #NeurIPS2018 Meta-Learning are now up. Big thanks to the organisers! metalearning.ml/2018/

115

399

Nando de Freitas · Mar 2, 2018 · 2:06 PM UTC

Nando de Freitas

@NandoDF

2 Mar 2018

It is beyond any doubt that over the next few years we will perfect the technology for automatically generating a video of anyone saying anything we type, with the right voice too. What implications do you think this will have? What are the applications? How do we mitigate risks?

171

388

Nando de Freitas · Aug 17, 2024 · 12:27 PM UTC

Nando de Freitas

@NandoDF

17 Aug 2024

What happens to a company is not only the result of how good and committed their tech people are, but it is greatly influenced by the business decisions, the leadership, and the operating environment. Work life balance is important whether in a company or a startup (speaking from experience in both). Would you want to miss an important event in your child’s life so that a decade later the boss for whom you worked so hard comes out and blames your going home in time for dinner as the reason why the company is doing poorly? This is not denying that work from home policies are influential. However, they are only one of many factors that impact the ranking of a company in a leaderboard. From what I see, Google continues to be extremely pioneering and impactful. Its scientists and engineers are truly exceptional and worthy of admiration, and a bit more respect from previous leads.

392

30,280

Nando de Freitas · Mar 13, 2019 · 9:57 AM UTC

Nando de Freitas

@NandoDF

13 Mar 2019

For anyone interested in meta-learning / learning to learn / continual learning / robotics / imitation, I’ll be giving a talk covering these topics at the Turing Institute in London.

The Alan Turing Institute @turinginst

11 Mar 2019

“The illiterate of the 21st century will not be those who cannot read and write, but those who cannot learn, unlearn, and relearn.” - Alvin Toffler We welcome @NandoDF, whose talk will focus on building tools that learn how to learn. MORE INFO: bit.ly/NandoTalk

381

Nando de Freitas · Sep 20, 2024 · 11:01 PM UTC

Nando de Freitas

@NandoDF

20 Sep 2024

The @OpenAI o1 models represent one of the smartest advances in AI in a long time. Having just joined @Microsoft AI, one of the things I really look forward to is being able to contribute to some of these fruitful ideas to advance OpenAI’s mission. The opportunity to work together with the many clever engineers and researchers at OpenAI and Microsoft and contribute to their projects is a huge privilege, and humbling 😅

OpenAI

@OpenAI

12 Sep 2024

We're releasing a preview of OpenAI o1—a new series of AI models designed to spend more time thinking before they respond. These models can reason through complex tasks and solve harder problems than previous models in science, coding, and math. openai.com/index/introducing…

366

78,044

Nando de Freitas · Oct 21, 2025 · 7:23 AM UTC

Nando de Freitas

@NandoDF

21 Oct 2025

One of the most important papers of the year.

vLLM

@vllm_project

20 Oct 2025

🚀 DeepSeek-OCR — the new frontier of OCR from @deepseek_ai , exploring optical context compression for LLMs, is running blazingly fast on vLLM ⚡ (~2500 tokens/s on A100-40G) — powered by vllm==0.8.5 for day-0 model support. 🧠 Compresses visual contexts up to 20× while keeping 97% OCR accuracy at <10×. 📄 Outperforms GOT-OCR2.0 & MinerU2.0 on OmniDocBench using fewer vision tokens. 🤝 The vLLM team is working with DeepSeek to bring official DeepSeek-OCR support into the next vLLM release — making multimodal inference even faster and easier to scale. 🔗 github.com/deepseek-ai/DeepS… #vLLM #DeepSeek #OCR #LLM #VisionAI #DeepLearning

383

72,075

Nando de Freitas · Oct 12, 2025 · 1:34 PM UTC

Nando de Freitas

@NandoDF

12 Oct 2025

Machines that can predict what their sensors (touch, cameras, keyboard, temperature, microphones, gyros, …) will perceive are already aware and have subjective experience. It’s all a matter of degree now. More sensors, data, compute, tasks will lead without any doubt to the “I think therefore I am” moment for computers, and we’re not ready for it yet. arxiv.org/pdf/1804.06318 share.google/kxx6WyqHpwPmo6Q…

380

174,554

Nando de Freitas · Jul 1, 2024 · 11:28 AM UTC

Nando de Freitas

@NandoDF

1 Jul 2024

It amazes me that so many (even tech) people still don't get the transformative power of generative AI. The best text to image and video models in existence today all use synthetically generated captions (see e.g. Dalle 3 paper). Human generated data is substantially inferior. Generated text, images, video, touch, sound, radar, graphs, etc will simply become the data for training new more powerful foundation models. Moreover, if one has finetuned (or prompted) foundation models to evaluate the generated data, then RL (aka self learning) is trivial to implement. Finally, if one can imagine what will happen, one can control. This is the basis of model predictive control (MPC). So, we have only started to witness the impact of Gen AI for iterative improvement and control. This is the AlphaGo story repeating itself, but now with all types of human tasks, knowledge and modalities.

Aran Komatsuzaki

@arankomatsuzaki

1 Jul 2024

Scaling Synthetic Data Creation with 1,000,000,000 Personas - Presents a collection of 1B diverse personas automatically curated from web data - Massive gains on MATH: 49.6 ->64.9 repo: github.com/tencent-ailab/per… abs: arxiv.org/abs/2406.20094

320

79,349

Nando de Freitas · Oct 19, 2024 · 7:27 PM UTC

Nando de Freitas

@NandoDF

19 Oct 2024

Replying to @ylecun

I partly disagree, Yann. First, I called on both big tech and government to be clear. Second, Tech corporations are partly responsible and cannot be excused. SF is where they operate at large, and they have huge influence on who gets elected and what laws are passed. They lobby and have a strong voice. Why, in a city where so many are either in tech or unemployed, do tech companies try to excuse themselves? Also, why do corporations advertise missions like “make the world a better place” and do nothing about the problems in their neighbourhood? Corporations are exploiting entities unless employees have a say. For example, California companies enforce outrageous 6 month non-competes for junior AI scientists and engineers in London even though they cannot do it in their Bay Area home. The consequence of this behaviour for the future of AI is profound: The AIs in London will enforce non-competes, the AIs in some countries will be homophobic, and so on. There are complex ethical issues here that we are ignoring, like the homeless in SF. Tech is influential enough that if they choose to do something about it, it will improve. It will also be better for them in the long run because it’s their home.

340

47,981

Nando de Freitas · Jan 25, 2023 · 9:41 AM UTC

Nando de Freitas

@NandoDF

25 Jan 2023

This video on distributed neural net training should be part of every machine learning & AI course in the world. It is brilliantly done. Thanks @gbarthmaron for pointing me to it. Has anyone already converted it to a homework exercise in a class? microsoft.com/en-us/research…

ZeRO & DeepSpeed: New system optimizations enable training models with over 100 billion parameters...

The latest trend in AI is that larger natural language models provide better accuracy; however, larger models are difficult to train because of cost, time, and ease of code integration. Microsoft is...

microsoft.com

368

68,663

Nando de Freitas · Jun 8, 2025 · 10:47 PM UTC

Nando de Freitas

@NandoDF

8 Jun 2025

Many people work through the weekend, without time to announce it on Twitter. I spent my Sunday happily with my family and friends, supporting and sharing love. I feel as a result energised and I’m looking forward to a productive week. Having time off is wonderful for physical and mental health. I too spent yesterday learning because I enjoy it. I don’t even think of it as work and I can’t wait to apply it next week. Yet, I do feel for the millions who worked through the weekend because they had no other option. I look forward to a world where more people can have the weekend off and to do something meaningful.

This tweet is unavailable

367

38,894

Nando de Freitas · Sep 17, 2020 · 9:17 AM UTC

Nando de Freitas

@NandoDF

17 Sep 2020

So glad to see the value of scientific software frameworks being properly recognised.

NumPy @numpy_team

16 Sep 2020

The NumPy paper is out! nature.com/articles/s41586-0…

348

Nando de Freitas · Feb 18, 2024 · 9:30 AM UTC

Nando de Freitas

@NandoDF

18 Feb 2024

I agree with @DrJimFan. Life, with all its mind-blowing structure, is about creating order in a universe of increasing disorder, see e.g. newscientist.com/article/232… for an easy intro. Like a cell, a neural network during training takes energy to minimise disorder, that is to predict and generalise better. In fact we even call the loss negative entropy. Like life, the net is part of a bigger environment that gives it data and feedback. Like life, the process results in a lot of disorder for the universe (TPU and GPU heat). In summary we have all the ingredients for intelligence (an emergent property of life), including our understanding of physics. I’d be thankful if someone makes a crisper version of this argument. The only way a *finite sized* neural net can predict what will happen in any situation is by learning internal models that facilitate such predictions, including intuitive laws of physics. Given this intuition, I cannot find any reason to justify disagreeing with @DrJimFan. With more data of high quality, electricity, feedback (aka fine tuning, grounding), and parallel neural net models that can efficiently absorb data to reduce entropy, we will likely have machines that reason about physics better than humans, and hopefully teach us new things. Incidentally, we are the environment of the neural nets too, consuming energy to create order (e.g. increasing quality of datasets for neural net training). These are old ideas going back to Boltzmann and Schrodinger among others. They provide the theoretical foundations. Now, it’s about building the code and conducting the experiments, and doing so *responsibly* and *safely* because these are very powerful technologies.

Is life the result of the laws of entropy?

Nearly 80 years ago, Erwin Schrödinger used the physics of the day to try to understand the origins of life. Now, Stephon Alexander and Salvador Almagro-Moreno try to do the same with modern science

newscientist.com

Jim Fan

@DrJimFan

16 Feb 2024

I see some vocal objections: "Sora is not learning physics, it's just manipulating pixels in 2D". I respectfully disagree with this reductionist view. It's similar to saying "GPT-4 doesn't learn coding, it's just sampling strings". Well, what transformers do is just manipulating a sequence of integers (token IDs). What neural networks do is just manipulating floating numbers. That's not the right argument. Sora's soft physics simulation is an *emergent property* as you scale up text2video training massively. - GPT-4 must learn some form of syntax, semantics, and data structures internally in order to generate executable Python code. GPT-4 does not store Python syntax trees explicitly. - Very similarly, Sora must learn some *implicit* forms of text-to-3D, 3D transformations, ray-traced rendering, and physical rules in order to model the video pixels as accurately as possible. It has to learn concepts of a game engine to satisfy the objective. - If we don't consider interactions, UE5 is a (very sophisticated) process that generates video pixels. Sora is also a process that generates video pixels, but based on end-to-end transformers. They are on the same level of abstraction. - The difference is that UE5 is hand-crafted and precise, but Sora is purely learned through data and "intuitive". Will Sora replace game engine devs? Absolutely not. Its emergent physics understanding is fragile and far from perfect. It still heavily hallucinates things that are incompatible with our physical common sense. It does not yet have a good grasp of object interactions - see the uncanny mistake in the video below. Sora is the GPT-3 moment. Back in 2020, GPT-3 was a pretty bad model that required heavy prompt engineering and babysitting. But it was the first compelling demonstration of in-context learning as an emergent property. Don't fixate on the imperfections of GPT-3. Think about extrapolations to GPT-4 in the near future.

331

155,100

Nando de Freitas · Nov 15, 2019 · 10:42 AM UTC

Nando de Freitas

@NandoDF

15 Nov 2019

It is incredibly transformative when a tech leader like @JeffDean participates at events like @Khipu_AI and @DeepIndaba He gives hope, builds confidence, and helps create new opportunities and collaborative communities across the world. #inspiring #masakhane @GoogleAI

Roberta Duarte @import_robs

14 Nov 2019

I just met the amazing @JeffDean from @GoogleAI at @Khipu_AI. He was so nice to hear about my project and to agree to take a picture with me! Thank you so much! This moment is one of the best in my life.

333

Nando de Freitas · Mar 11, 2023 · 1:21 PM UTC

Nando de Freitas

@NandoDF

11 Mar 2023

Diffusion as a neural net, a language model in jax, attention and transformers — some slides from my @Khipu_AI tutorial

340

42,575

Nando de Freitas · Mar 13, 2017 · 11:21 PM UTC

Nando de Freitas

@NandoDF

13 Mar 2017

Google Brain’s new super fast and highly accurate AI: the Mixture of Experts Layer. medium.com/@thoszymkowiak/go…

156

342

Nando de Freitas · May 12, 2022 · 4:25 PM UTC

Nando de Freitas

@NandoDF

12 May 2022

Two years in the making by a talented, collaborative, and fun team, and with enormous help and support from many others at @DeepMind. No better place to be! Congrats @scott_e_reed on this step.

Google DeepMind

@GoogleDeepMind

12 May 2022

Gato🐈a scalable generalist agent that uses a single transformer with exactly the same weights to play Atari, follow text instructions, caption images, chat with people, control a real robot arm, and more: dpmd.ai/Gato Paper: dpmd.ai/Gato-paper 1/

328

Nando de Freitas · Mar 31, 2017 · 7:28 AM UTC

Nando de Freitas

@NandoDF

31 Mar 2017

One-Shot Imitation Learning - This paper is excellent. arxiv.org/abs/1703.07326

One-Shot Imitation Learning

Imitation learning has been commonly applied to solve different tasks in isolation. This usually requires either careful feature engineering, or a significant number of samples. This is far from...

arxiv.org

105

329

Nando de Freitas · Dec 3, 2019 · 11:55 PM UTC

Nando de Freitas

@NandoDF

3 Dec 2019

This type of intelligence always amazes me.

Massimo

@Rainmaker1973

3 Dec 2019

Scientists put slime mold over a map of Tokyo, with food used to represent urban areas, and after a day the mold created a network nearly identical to Tokyo's rail network: all this without any brain ow.ly/7CA730o4Sjk

321

Nando de Freitas · Dec 28, 2019 · 4:01 PM UTC

Nando de Freitas

@NandoDF

28 Dec 2019

It would help the discussion if everyone first 1. reads a causal inference book, eg oapen.org/download?type=docu…, 2. watches a deep learning course emphasising modularity, compositionality and automatic differentiation, 3. implements the CI book examples in eg @PyTorch

Judea Pearl

@yudapearl

28 Dec 2019

I am retweeting this reply, for it crystallizes my position in the latest conversation on the relationships between DL (deep learning) and CI (causal inference) with @tdietterich , @ylecun, @GaryMarcus, @rodneyabrooks and significant others. #Bookofwhy

324

Nando de Freitas · May 8, 2025 · 7:02 AM UTC

Nando de Freitas

@NandoDF

8 May 2025

Why don’t you compete with ChatGPT? Last time I checked Gemini app was an order of magnitude behind. More seriously, this arrogance and comparative disparaging is toxic and not needed in our community. Do the best work you can to help those with less privilege. That’s it.

Jonas Adler

@JonasAAdler

6 May 2025

Competing with ourselves is getting a bit boring

332

102,628

Nando de Freitas · Apr 4, 2024 · 10:49 PM UTC

Nando de Freitas

@NandoDF

4 Apr 2024

Thanks @osanseviero and @huggingface for inviting me to a wonderful AI dinner, where I had the pleasure of catching up with old friends, @sarahookr @neilzegh @laurentsifre @ylecun, meet new amazing people, and do one of the things I absolutely love: brainstorm about AI with people who are passionate about it and its impact. 🤗

313

109,001

Nando de Freitas · Dec 7, 2016 · 12:29 PM UTC

Nando de Freitas

@NandoDF

7 Dec 2016

What an amazing authoritative tutorial on deep reinforcement learning at #nips2016 - a must read set of slides! people.eecs.berkeley.edu/~pa…

123

309

Nando de Freitas · Jan 25, 2018 · 11:55 PM UTC

Nando de Freitas

@NandoDF

25 Jan 2018

#ICLR 2019 could happen in Cape Town, South Africa. It’s time for an ML conference to go to Africa. It’s the right thing to do, and we all know it.

307

Nando de Freitas · Jan 15, 2022 · 9:43 AM UTC

Nando de Freitas

@NandoDF

15 Jan 2022

Here is a great challenge for quadruped robotics — Super Cat Intelligence. 🐭 risky, but nonetheless good benchmark tasks

This tweet is unavailable

299

Nando de Freitas · Nov 5, 2017 · 8:41 AM UTC

Nando de Freitas

@NandoDF

5 Nov 2017

This is one of the most thought provoking and transforming books I have read recently. Solid research, heart-warming. Highly recommend it.

Susan David, Ph.D.

@SusanDavid_PhD

6 Sep 2016

“This book had a profound impact on my day-to-day life.” Order #EmotionalAgility here: buff.ly/2c8uYnm

310

Nando de Freitas · Jan 10, 2019 · 7:03 PM UTC

Nando de Freitas

@NandoDF

10 Jan 2019

Neural Ordinary Differential Equations .... blog explains blog.acolyer.org/2019/01/09/…

303

Nando de Freitas · Oct 20, 2019 · 8:12 AM UTC

Nando de Freitas

@NandoDF

20 Oct 2019

I find it funny folks are focusing on the symbolic challenge. The big challenge is attaching that hand to a moving controllable robot arm, and preferably having two coordinated hands learning diverse behaviours by RL, from sensors, with low sample complexity and in a safe manner.

Gary Marcus

@GaryMarcus

19 Oct 2019

Since @OpenAI still has not changed misleading blog post about "solving the Rubik's cube", I attach detailed analysis, comparing what they say and imply with what they actually did. IMHO most would not be obvious to nonexperts. Please zoom in to read & judge for yourself.

298

Nando de Freitas · May 29, 2019 · 7:48 AM UTC

Nando de Freitas

@NandoDF

29 May 2019

The Illustrated BERT, ELMo, and co. (How NLP Cracked Transfer Learning) jalammar.github.io/illustrat…

297

Nando de Freitas · Sep 6, 2025 · 7:29 AM UTC

Nando de Freitas

@NandoDF

6 Sep 2025

I loved this research paper on Flow Matching, the most popular approach for video gen. TLDR: More data means harder to fit any specific data (say image), better generalisation, greater coverage of learned concepts. Pretraining: Use Billions of data to learn many things and none too well. Post-training: Use few data, eg to quickly overfit to a certain style. Too few or too many won’t work. Question for Twitter: Does this apply to text LLMs? What have you experienced? arxiv.org/abs/2506.03719

On the Closed-Form of Flow Matching: Generalization Does Not Arise...

Modern deep generative models can now produce high-quality synthetic samples that are often indistinguishable from real training data. A growing body of research aims to understand why recent...

arxiv.org

311

34,516

Nando de Freitas · Mar 10, 2019 · 8:27 PM UTC

Nando de Freitas

@NandoDF

10 Mar 2019

Today I felt very positive about the work I’ve been doing with students at UBC and Oxford, and with colleagues at DeepMind when (1) a complete stranger approached me at a Camden gym and said “thank you for your lectures” - a nice gesture that made my day, and

294

Nando de Freitas · Jun 6, 2017 · 6:36 AM UTC

Nando de Freitas

@NandoDF

6 Jun 2017

One of the most important deep learning papers of the year, thus far.

Adam Santoro @santoroAI

6 Jun 2017

Excited about our new paper on relational reasoning, with @dnraposo @PeterWBattaglia and others at DeepMind. arxiv.org/abs/1706.01427

100

301

Nando de Freitas · Jun 24, 2020 · 8:16 AM UTC

Nando de Freitas

@NandoDF

24 Jun 2020

This is a superb and inspiring artificial intelligence talk. The best I’ve heard this year. Anyone interested in vision, control, robotics or managing AI projects should watch this. Well done @karpathy piped.video/g2R2T631x7k via @YouTube

290

Nando de Freitas · Nov 30, 2024 · 6:21 PM UTC

Nando de Freitas

@NandoDF

30 Nov 2024

I used a metaphor that upset some people. I welcomed the feedback, and I have deleted the tweet. I also apologise to anyone I might have offended. The rest of the tweet is about something very important that we scientists and engineers in AI need to talk about. California corporations are forcing existing employees in the UK and Europe to sign 6-month to 1-year non-competes. These contracts are not signed at the start, but after many years when the companies have disproportionate power over the employees. Imagine receiving a promotion because of merit and hard work, being happy about it because you’re starting a new family, but being told that you won’t get it unless you sign a 1-year non-compete. It is not easy to leave the job then. It’s hopeless for the scientist, engineer or any other worker. So you sign. You then cannot leave because you won’t be able to do AI research for a long time, and some of your promised money isn’t paid. It is emotionally abusive and terrible for scientific progress and technological development. The AI companies that enforce this in London and European capitals don’t do it in California. This puts Europe at a huge disadvantage too. It’s non-competitive. These are the companies claiming they are building the good AIs. Good? No, they are building AIs to exploit the laws of countries, created for other purposes, to exploit people. @GoogleDeepMind is particularly bad at this. They have been forcing researchers going up for promotion based on merit to sign 6-month non-competes, 1-year non-solicits, and 6-month notice periods (garden leave). This has made it clear that they don’t trust their own employees and will do whatever it takes to stifle their competition. Researchers leaving have been forced to comply. The ironic thing is that they can’t enforce this in their California home. The executives don’t believe in it. They are exploiting our laws to their benefit. Colleagues and friends (@icmlconf @NeurIPSConf @theinformation).
Do NOT sign these contracts. They can’t make you do it. Without us engineers and AI people, their stock is worthless. Don’t let them abuse you. This applies to people joining my team: do not sign these restrictions on your freedom to do research and work. @GoogleDeepMind has wonderful people and very dear friends of mine who I look up to, but this practice of @Google should be illegal. It should be illegal for other corporations too because it stifles research in AI and progress. It is also pathetic. @GoogleDeepMind can do much better: Do the right thing and make us proud please. The European Union @vestager should do something about this. It should be illegal and companies should be punished if they choose to stifle competition in Europe.

286

77,408

Nando de Freitas · Oct 20, 2021 · 9:12 AM UTC

Nando de Freitas

@NandoDF

20 Oct 2021

If you’re advertising a machine learning or AI scholarship or job on Twitter, please consider announcing it to @QueerinAI @AiDisability @black_in_ai @Khipu_AI @DeepIndaba @_LXAI @WiMLworkshop @women_in_ai and other groups who care about diversity and inclusion. Thanks

273

Nando de Freitas · Jun 22, 2018 · 11:46 PM UTC

Nando de Freitas

@NandoDF

22 Jun 2018

Adobe is using machine learning to make it easier to spot Photoshopped images theverge.com/2018/6/22/17487…

276

Nando de Freitas · Dec 9, 2020 · 12:45 PM UTC

Nando de Freitas

@NandoDF

9 Dec 2020

This is EPIC!! It will go down in history as one of the best #neurips talks of all time. @isbellHFh and colleagues 💜💙💚💛🧡❤️ on: Can’t Escape Hyperparameters and Latent Variables: Machine Learning as a Software Engineering Enterprise nips.cc/virtual/2020/public/…

273

Nando de Freitas · Dec 10, 2023 · 9:56 AM UTC

Nando de Freitas

@NandoDF

10 Dec 2023

Once upon a time at a #NeurIPS party. So much has changed. Thanks @sirbayes for finding this.

285

58,499

Nando de Freitas · Sep 26, 2025 · 8:03 AM UTC

Nando de Freitas

@NandoDF

26 Sep 2025

After 55, one reflects on life. For me, mathematics was the most beautiful world I encountered.

284

25,515

Nando de Freitas · Oct 22, 2021 · 3:55 PM UTC

Nando de Freitas

@NandoDF

22 Oct 2021

Shaking the foundations: delusions in sequence models for interaction and control. I learned so much from Pedro Ortega in this thought-provoking AI project. Great way to spend time with a friend at a London pub. arxiv.org/pdf/2110.10819.pdf

273