Research Scientist at @GoogleDeepMind working on Gemini and Search.

London, UK
AI is a form of empirical philosophy. I bet that if Plato, Pyrrho, Descartes or Wittgenstein were around now, they’d be tinkering with neural networks. With language models, with generative models and with agents.
33
64
467
Now for something different! Deep RL + GAN training + CelebA = artificial caricature. Agents learn to draw simplified (artistic?) portraits via trial and error. @ #NeurIPS2019 creativity workshop. Animated paper: learning-to-paint.github.io PDF: arxiv.org/abs/1910.01007 Thread.
6
189
550
"Neural scene representation and rendering" now in @sciencemagazine. By training deep networks to predict what scenes look like from new viewpoints, we get them to understand images: deepmind.com/blog/neural-sce… @DeepSpiker @OriolVinyalsML @theophaneweber @demishassabis
5
160
491
Conditional Neural Process implementation by Marta Garnelo! Neural networks meet stochastic processes. Useful for few-shot regression, classification and meta-learning: 1. Repo: github.com/deepmind/conditio… 2. Notebook: github.com/deepmind/conditio… 3. Paper: arxiv.org/abs/1807.01613
5
178
474
This started as a prototype we built back in March 2024 which we called ‘Neural Google’. The core idea: wrap Gemini around Google and let it do the searching for you.
24
26
457
54,828
This is AGI complete
15
45
429
38,617
Gemini 2.5 Pro just got even better at code ✨ #1 on LMArena with 1448 Elo, #1 on WebDev Arena with 1420 Elo. Also SOTA for video, with 84.8% on VideoMME. @TimBettridge vibe-coded a 3D tour of the Art Institute of Chicago's collection with it, right in @GeminiApp Canvas 🎨
9
35
402
27,926
Notebooks for Neural Processes (NPs) and Attentive Neural Processes (ANPs) now available. Compared to CNPs (published last year): 1. NPs model the function with a global latent, and 2. ANPs fit the data better. Kudos to @hyunjik11 and Marta Garnelo et al! github.com/deepmind/neural-p…
3
81
371
Getting closer to the dream! A network that uses unlabelled images to boost performance when labels are scarce (new SOTA), and it's no worse than ResNet when labels are plentiful. Also: an unsupervised net with just a linear classifier on top outperforms the original AlexNet! arxiv.org/abs/1905.09272
4
99
350
Introducing PolyGen: an autoregressive model of 3D meshes. arxiv.org/abs/2002.10880 Transformers + Pointer Nets = train on raw mesh data (i.e. variable-length lists of vertices and faces). No need to voxelise or rasterise! with @charlietcnash @yaroslav_ganin @PeterWBattaglia
5
51
274
better late than never. lots of rough edges still, but the team is grinding. do share feedback: github.com/google-gemini/gem…
16
18
287
26,021
Clean and performant PyTorch implementation of Generative Query Networks by Shohei Taniguchi: github.com/iShohei220/torch-… Pixyz implementation by Shohei Taniguchi and Masahiro Suzuki: github.com/masa-su/pixyzoo/t… Pixyz (deep generative modeling library): github.com/masa-su/pixyz 👌
83
260
Genie 3 is the most impressive AI demo I've seen since ChatGPT. In 2016 we were working on 'Neural Representation and Rendering' and could already see a vague path to this. But I didn’t think it’d happen so soon.
12
28
236
28,927
@irinavlh, @DaniloJRezende and I are presenting a tutorial at ICML on 'Representation Learning Without Labels'. Jul 13, 9 AM to 12 PM and 7 PM to 10 PM (BST) icml.cc/virtual/2020/tutoria… icml.cc/Conferences/2020/Sch… Drop by to find out what Plato has to do with VAEs, GANs and SimCLR!
5
51
233
We're hiring a research scientist to join our Quantum Chemistry and Materials team ⚛️🚨 The team is working on using machine learning to better our understanding of the universe, down at the level of quantum physics. See: deepmind.com/blog/simulating… Share: boards.greenhouse.io/deepmin…
4
49
207
53,948
This is a fantastic resource. It was written before the deep learning revolution and therefore provides good context for it all. I spent most of the first year of my PhD jumping from chapter to chapter of this book.
"Pattern Recognition and Machine Learning" by @ChrisBishopMSFT is now available as a free download. Download your copy today for an introduction to the fields of pattern recognition & machine learning: aka.ms/prml #ML #Insights
3
41
198
Cool work by Brett Göhre, showing that a GQN trained on synthetic data can be leveraged to transfer to real images. Impressive results! docs.google.com/presentation… @brett_gohre
1
45
197
Love this visualisation. Drives home just how two-dimensional vision is, and how much work our brains do to make it feel 3D. We see the world through a small flat window. @UnitreeRobotics
2
19
181
35,507
Slides now available to download: drive.google.com/file/d/1Ee2… Thank you all for attending the morning session. We'll be back online tonight. Tune in for more Q&A.
@irinavlh, @DaniloJRezende and I are presenting a tutorial at ICML on 'Representation Learning Without Labels'. Jul 13, 9 AM to 12 PM and 7 PM to 10 PM (BST) icml.cc/virtual/2020/tutoria… icml.cc/Conferences/2020/Sch… Drop by to find out what Plato has to do with VAEs, GANs and SimCLR!
3
44
178
Probabilistic U-Nets adapted to produce calibrated uncertainties. Very important for clinical deployment of segmentation networks. Cool work!
NEW PAPER @miccai2019! "Supervised Uncertainty Quantification for Segmentation with Multiple Annotations". We adapt Prob Unet to output epistemic & CALIBRATED aleatoric uncertainties arxiv.org/abs/1907.01949. Work w. Shi Hu, Stefan Knegt, @BasVeeling, Henkjan Huisman & @wellingmax
32
178
A visual introduction to probability and statistics: seeing-theory.brown.edu/inde… 👌
47
165
If you're interested in interning at DeepMind, the deadline for applications is Oct 29th. You need to be in the last two years of your PhD programme, and be available for 14-20 weeks in 2019. It doesn't matter what country you're based in. Get in touch with me! Right now!
13
50
165
Introducing Gemini 2.5 Pro 🌀 which thinks natively and is SOTA across a number of key math, reasoning and science benchmarks
6
6
161
9,310
🍌🤖💀✨
7
10
160
14,021
Wow, very impressive samples by an ICLR 2019 submission (I had nothing to do with this paper). Crazy to think how much information there is hidden in a collection of images. Enough to allow a model to generalise this convincingly. Paper: openreview.net/pdf?id=B1xsqj…
2
26
157
Absolutely mind blowing talk about non-neural computation, a.k.a. 'primitive cognition'. Liquefied brains that retain their memories, two headed worms, salamander tails that turn themselves into legs, and much more. Highly recommended, fascinating watch: bit.ly/2Qgls71
2
42
150
This figure from the impressive DINOv3 paper is fun to think about. Pretend it's 2018 and you're deciding what research to focus on. Self supervised is <40% and supervised >80%. Would you bet on SSL ever catching up? Some people were believers even then. Have faith!
Introducing DINOv3 🦕🦕🦕 A SotA-enabling vision foundation model, trained with pure self-supervised learning (SSL) at scale. High quality dense features, combining unprecedented semantic and geometric scene understanding. Three reasons why this matters…
6
14
143
15,455
A comprehensive overview of the Neural Process Family: - What do they have to do with Neural Networks? - What do they have to do with Gaussian Processes? - What does it all have to do with Meta Learning? - What advances have been made in the last 4 years? arxiv.org/abs/2209.00517
2
30
126
Look mum, no NeRF! And from a single reference image. Absolutely gorgeous.
Another one. Already a powerful painting, but moving around it yourself gives a totally different feeling. Jacques Louis David's "The Death of Socrates" => #Genie3
11
7
135
9,737
Contrastive Training for Improved Out-of-Distribution Detection arxiv.org/abs/2007.05566 Joint (cross entropy + SimCLR) training gives your network a feature space that is better for OOD detection than cross entropy training alone.
2
17
112
Lateral thinking: "Wait a minute... could I turn one of the numbers upside down?"
Replying to @OfficialLoganK
It’s still an early version, but check out how the model handles a challenging puzzle involving both visual and textual clues: (2/3)
3
4
100
15,132
I worked with Felix on a paper in 2021. I remember Felix as a consistently kind person, and as a brilliant thinker. I'm sharing his farewell letter, as I think it's what he would have wanted. Please be mindful, it's not an easy read: docs.google.com/document/d/1… Rest in peace.
Do you work in AI? Do you find things uniquely stressful right now, like never before? Have you ever suffered from a mental illness? Read my personal experience of those challenges here: docs.google.com/document/d/1…
3
3
96
13,440
Research scientist internship applications are now open for London. First deadline: Oct 4th. Second window: Dec 6th - Dec 17th. Just do it! Even you. I expect things like geography / nationality to be less of a blocker than before. deepmind.com/careers/jobs/25…
3
10
94
Differentiable Monte Carlo ray tracing. Very cool. "We interface [the method] with PyTorch and show prototype applications in inverse rendering and the generation of adversarial examples for neural networks." Now we just need to make it faster! people.csail.mit.edu/tzumao/…
21
91
Josh Tenenbaum on artificial intelligence @icmlconf
1
20
84
1 click address2watercolour, brought to you by 🍌. the Edinburgh flat i grew up in all those years ago 🥲 going to send this to my parents!
8
5
90
8,374
Very understandable yet detailed explanation of GQN 👌
Replying to @binarybits
Mind-blowing stuff from Google's DeepMind. arstechnica.com/science/2018…
1
16
88
GQN + time: Instead of predicting what a scene looks like from a new viewpoint, predict what it will look like at a new timestamp. For consistent samples, introduce global latent variable. Very cool work by @ananyaku @DeepSpiker @mpshanahan et al. deepmind.com/documents/227/c…
3
24
81
DALL-E 2 and Flamingo are the most impressive AI demos that I've ever seen. I wouldn't have predicted that we'd be here if you'd asked me 2 years ago. Not even in the best case scenario.
Introducing Flamingo 🦩: a generalist visual language model that can rapidly adapt its behaviour given just a handful of examples. Out of the box, it's also capable of rich visual dialog. Read more: dpmd.ai/dm-flamingo 1/
1
7
78
2007. Me watching Jobs' iPhone keynote: "Dumbest idea ever. Browsing on the go? No keyboard? Is he high?" 2012. Me watching the AlexNet talk: "Dumbest idea ever. NNs can't do cats vs dogs, why jump to 1000-way classification?" Lesson: try suspension of disbelief once in a while
5
1
73
Hundreds of researchers make thousands of discoveries. Most researchers focus only on a specific part of the problem, and yet when all those discoveries are put together, it compounds. Pretty incredible to witness.
Exciting News from Chatbot Arena! @GoogleDeepMind's new Gemini 1.5 Pro (Experimental 0801) has been tested in Arena for the past week, gathering over 12K community votes. For the first time, Google Gemini has claimed the #1 spot, surpassing GPT-4o/Claude-3.5 with an impressive score of 1300 (!), and also achieving #1 on our Vision Leaderboard. Gemini 1.5 Pro (0801) excels in multi-lingual tasks and delivers robust performance in technical areas like Math, Hard Prompts, and Coding. Huge congrats to @GoogleDeepMind on this remarkable milestone! Gemini (0801) Category Rankings: - Overall: #1 - Math: #1-3 - Instruction-Following: #1-2 - Coding: #3-5 - Hard Prompts (English): #2-5 Come try the model and let us know your feedback! More analysis below👇
1
3
75
18,811
predictions for 2024: - vision+language models go continuous and real-time (not just turn-based) - nerfs/splats/etc get strong priors (only 1 image of a complex test scene to get full 3D) - generative video models reach photo-realism - major election scandal powered by an LLM
3
3
68
8,624
This is very cool. Impressive action-conditioned, language-conditioned, or un-conditioned rollouts of videos. We worked on this hard at DM back in 2016/2017, but it was very difficult to get working then. Huge progress!
It’s #GAIA1 world, we just drive in it. Generate realistic driving videos using only prompts. See how it works below 🧵
2
3
69
15,551
Exciting updated results for self-supervised representation learning on ImageNet: - 71.5% top-1 with a *linear* classifier - 77.9% top-5 with only *1%* of the labels - 76.6 mAP when transferred to PASCAL VOC-07 (better than *fully-supervised's* 74.7 mAP) arxiv.org/abs/1905.09272
1
9
70
Computer vision is far from solved. Nice slides by Thomas Funkhouser precisely describing a few of the open problems, along with supervised learning solutions. Question is, how can we learn all of these capabilities with less/no supervision? cs.princeton.edu/~funk/bridg…
19
68
Mind-blowing. Between this and DALL-E, I genuinely believe that our relationship with the concept of an 'image' is changing, forever. There will now be a period of human history before such models, and a period after. Amazing work @Chitwan_Saharia, @wchan212, @mo_norouzi et al.
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding project page: gweb-research-imagen.appspot… sota FID(7.27 on COCO), without ever training on COCO, human raters find Imagen samples to be on par with the COCO data itself in image-text alignment
2
3
60
Very cool. Nice to see generative models used in new and innovative ways... deepmind.com/blog/alphafold/
12
64
Two reasons why vision is hard: 1. 2D images are always only /projections/ of an underlying 3D reality. 2. Sometimes we're interested in classifying 3D realities that are only subtly different from each other. We're currently better at problem 2 than problem 1.
1
16
64
Very impressive to see high-level, multi-player strategy emerging from pure RL on raw pixels. Could this be how we design game AIs of the future? Cool new work by @maxjaderberg et al. deepmind.com/blog/capture-th…
1
21
64
"It's true that life is short. The solution isn't to do things quickly, but to do things the way they're meant to be done."
6
60
Sequential Attend, Infer, Repeat: A generative model of moving objects. arxiv.org/abs/1806.01794 Very cool work by @arkosiorek.
15
59
Charlie has just released code + colab for PolyGen, a generative model of vertices and faces. Check it out!
We've just released code for PolyGen, our generative model of 3D meshes github: github.com/deepmind/deepmind…
1
57
Our paper 'A Probabilistic U-Net for Segmentation of Ambiguous Images', led by @saakohl, will be presented at #NIPS2018! Code is available at github.com/SimonKohl/probabi…
Our paper 'A Probabilistic U-Net for Segmentation of Ambiguous Images' was accepted at #NIPS2018 as a spotlight presentation! A re-implementation of the code is now available at github.com/SimonKohl/probabi…. Paper arxiv.org/abs/1806.05034 by @DeepMindAI and @mic_dkfz.
16
60
GQN + attention: Better likelihoods, faster training, more complex scenes (Minecraft). Given a neural scene representation, can you localise new images? Great work by @danrsm @DeepSpiker @fabiointheuk et al. deepmind.com/documents/229/g…
18
54
Amazing to see how far we've come in about 10 years. Compare piped.video/watch?v=tk9FTdKO… (~state of the art at CVPR 2012) with the video in the tweet below.
Palette: Image-to-Image Diffusion Models abs: arxiv.org/abs/2111.05826 project page: iterative-refinement.github.… a simple and general framework for image-to-image translation using conditional diffusion models
1
9
52
If you're ever training a (potentially conditional) VAE and find yourself struggling to keep the KL down, you need to use this. Figure 5 on page 9 is a good start. @DeepSpiker @fabiointheuk
Taming VAEs: A theoretical analysis of their properties and behaviour in the high-capacity regime. We also argue for a different way of training these models for robust control of key properties. It was fun thinking about this with @fabiointheuk arxiv.org/abs/1810.00597
8
52
Surely it's better to self-learn vision from videos than from just images? Surprisingly, that doesn't seem to have been the case... until now. A simple objective (VITO) + a balanced video dataset (VideoNet) shows us a path forwards! @nikparth1, @joaocarreira, @olivierhenaff
2
6
45
I'd forgotten how nice it is to talk with real 3D humans in a real 3D room. Grateful to be back at it after 2 long years.
Excellent talk on representation learning and neural scene understanding by DeepMind’s Ali Eslami @arkitus at @kth_rpl summer school. #KTHRPLSummerSchool
45
When starting out in a new field: 1. Work on ideas you believe in 2. Work with people you learn from 3. Have fun
2
2
44
If you're curious about how/why the aggregation functions of GQNs and NPs work (in particular, the summing aggregation function), see the paper below.
7
43
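A minimal sketch of the summing aggregation mentioned above, under stated assumptions: `encode` is a hypothetical hand-written stand-in for a learned encoder, and the numbers are made up. It only illustrates one property of summing (the aggregate does not depend on the order of the observations) and is not the GQN or NP implementation.

```python
from itertools import permutations


def encode(x, y):
    """Hypothetical stand-in for a learned encoder: maps one (input, output)
    observation to a small feature vector. The constant last feature makes
    the sum keep track of how many observations were seen."""
    return [x * y, x + y, 1.0]


def aggregate(context):
    """Sum the per-observation representations element-wise."""
    total = [0.0, 0.0, 0.0]
    for x, y in context:
        total = [t + r for t, r in zip(total, encode(x, y))]
    return total


context = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
reference = aggregate(context)

# Every ordering of the same context gives the same aggregate representation.
order_invariant = all(
    aggregate(list(p)) == reference for p in permutations(context)
)
```

The toy values are small integers stored as floats, so the sums are exact and can be compared with `==`; with real network outputs a tolerance-based comparison would be the safer choice.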
I would've considered this pure sci-fi not too long ago! We can now teach a model new concepts by showing it a sequence of examples of images and associated text. No param updates required. Concepts: e.g. how to classify, caption or answer Qs Model: pre-trained language model
1
2
43
A friend of mine (editor of an influential arts magazine) is looking to hire an intern to apply neural rendering (e.g. NeRF and family) to high-end fashion shoots with professional photography. If this is something you or someone you know might be interested in, DM me 💃🕺
4
6
45
Both released today. Left: A @NASAWebb image of Stephan's Quintet, five tightly-bound galaxies 290 million light-years away. Right: @clured's visualisation of 10 million chunks from the data used to train @BigscienceW and @huggingface's BLOOM model. Encoded then UMAPed to 2D.
1
5
43
Mason McGough @MasonMMcGough has written a nice piece on generating 3D models with PolyGen and PyTorch. Check it out! towardsdatascience.com/gener…
1
4
42
If you're interested in state-of-the-art machine learning research AND in positive real-world impact, consider joining the health research team in London. I've been collaborating with this team for a while and they are truly awesome 👌
Interested in state-of-the-art, clinically-applicable deep learning research & positive real-world impact? We are growing our health research team in London, with EHR & Imaging roles for talented deep learning research scientists & engineers. Get in touch if this sounds like you :)
2
5
40
Bayesian optimisation is an efficient strategy for optimization of black-box functions without derivatives. Here we show how Neural Processes can be used for this: arxiv.org/abs/1903.11907 With: @schwarzjn_, @agalashov, @hyunjik11, Marta Garnelo, @dwsaxton, @pushmeet, @yeewhye
Interested in adversarial tests and reinforcement learning? We combine meta-learning in a general probabilistic paradigm to detect failures, helping us build robust algorithms. Includes results on recommender systems and control: arxiv.org/abs/1903.11907 @schwarzjn_ @agalashov
3
42
As someone who largely ignored biology in school and at uni, this resonated with me deeply. I'm only now beginning to learn about biology and chemistry and it's all so incredibly beautiful. jsomers.net/i-should-have-lo…
1
2
38
Nice overview of the "predictive coding" theory of the brain, and how GQN relates to it. By Jordana Cepelewicz @QuantaMagazine quantamagazine.org/to-make-s…
14
37
A striking vision expressed by a single artist working with a powerful tool. An idea that would have likely taken a team of people to produce previously, and at much greater cost.
"Voyage through Time" is my first artpiece using #stablediffusion and I am blown away with the possibilities... We're crossing a threshold where generative AI is no longer just about novel aesthetics, but evolving into an amazing tool to build powerful, human-centered narratives
3
40
love the authors list on the CodeGemma paper: storage.googleapis.com/deepm…
1
2
36
2,832
When sufficiently constrained, agents learn to paint surprisingly abstract images. Some of the paintings remind me of cubist portraits. (Remember: no imitation or supervision). Can you spot any familiar faces? See learning-to-paint.github.io for loads more emergent drawing styles.
3
9
35
Excellent tutorial on generative models.
The slides of our CCN2018 tutorial can now be found here: tinyurl.com/ydyzvkbd
9
36
Of course it does NOT follow that all neural network tinkerers are therefore great philosophers 🥴
1
36
Interested in generative models, 3D computer vision or inverse graphics? We use ideas and techniques from these fields to show the possibility of imaging very small objects (e.g. proteins) more effectively. In this setting we cannot fall back to supervised learning!
Proteins are not static bricks! Feasibility study to infer a continuous distribution of all states using an end-to-end model from Cryo-EM images to atom coordinates: arxiv.org/abs/2106.14108. @danrsm, @GarneloMarta, @MichaelZielins, @JonasAAdler, @arkitus, @CarlDoersch, @pushmeet
2
6
28
"Persian" x "Robot" #dalle
2
2
31
If you've been wondering how Generative Query Networks should be trained in the absence of camera positions (e.g. in a SLAM setting), this paper offers a possible solution. Very cool work!
Happy to share our work: Shaping Belief States with Generative Environment Models for RL Thanks Karol Gregor, Frederic Besse, Yan Wu, Hamza Merzic and @avdnoord ! arxiv.org/abs/1906.09237v2 piped.video/dOnvAp_wxv0 #RL #SelfSupervised #GenerativeWorldModels #BeliefStates
2
28
Neural Processes (NPs) generalise GQN’s training regime to other few-shot prediction tasks arxiv.org/abs/1807.01622, arxiv.org/abs/1807.01613. Awesome work by Marta Garnelo who will be presenting at ICML: bit.ly/2MQ8wi3 , tinyurl.com/npaticml
8
30
Yesterday's dreams are today's reality. Incredible stuff.
Introducing GAIA-2 🌎Generative world modeling just stepped up a gear. GAIA-2 is the latest development of Wayve’s video-generative world model tailored for driving. GAIA-2 offers richer, more realistic, and highly controllable synthetic driving scenarios, accelerating Wayve’s path to safe driver assistance and automated driving at scale. Learn more about GAIA-2 in our Blog: wayve.ai/thinking/gaia-2/ #GAIA2 #GAIA #EmbodiedAI
1
28
1,618
For those of you attending NeurIPS, be sure to check out Olaf's talk at the medical imaging workshop. He'll be speaking about state-of-the-art machine learning for medicine, including our work on using probabilistic models to allow such systems to express uncertainty.🤔💉👌
Looking forward to the Medical Imaging meets NeurIPS Workshop next week Saturday (Dec, 8th). At 9:45am I'll present our work on radiotherapy planning (arxiv.org/abs/1809.04430), triaging eye diseases (nature.com/articles/s41591-0…) and the probabilistic u-net (arxiv.org/abs/1806.05034)
5
31
Reasoning training is getting AIs to be patient with their thoughts: “Don’t rush”. Agent training is getting AIs to be persistent with their actions: “Don’t give up”.
1
2
27
3,110
Excellent walk through of Neural Processes by @kasparmartens. Marta Garnelo, Dan Rosenbaum, @DeepSpiker, @schwarzjn_, @yeewhye and others.
Neural Processes - what they are and how they behave as distributions over functions. This blog post is my attempt to answer these questions: kasparmartens.rbind.io/post/…
14
29
Abstraction vs realism. With generative models, which use cases require one more than the other?
2
1
29
- Earth isn't at the centre of the universe. - Humans and animals share a huge chunk of their DNA. - And it's increasingly looking likely that brains aren't the only intelligent things around. We're not as special as we think.
2
2
27
Excellent overview of why this topic is important: towardsdatascience.com/the-q…
2
8
25
Painter = directed search e.g. neuroevolution Critic = img+txt encoder e.g. ALIGN or CLIP Artist = human that sets the txt input to the critic arxiv.org/abs/2105.00162 Creative work by @chrisantha_f with Jean-Baptiste Alayrac, @MirowskiPiotr, Dylan Banarse, @sindero Thread👇
1
7
23
New work with colleagues from @DeepMindAI: Kickstarting Deep Reinforcement Learning, proposes a paradigm where 'teacher' agents help train 'student' agents. Benefits include faster research cycles and students that can surpass their teachers: arxiv.org/abs/1803.03835
2
26
My suggestion: 1. Fill out the form. 2. Send an email with CV attached to 5 researchers at DM you'd like to work with (senior or junior), indicating research interests, mentioning you've already submitted the form. Some researchers won't / can't reply. 3. Resume life as normal.
25
someone needs to make a modern documentary about intelligence for the general public. not on AI, but intelligence itself. we have many great docus on the wonder of the cosmos, and not just on spaceships. too much emphasis on the artifact and not enough on the phenomenon imo
1
25
2,134
First high-resolution image of Ultima Thule. The object is minuscule, only 19 km long, but it's over 6 billion km from earth. Read this nitter.app/Alex_Parker/status/107… for a fascinating sneak peek into what it takes to do this kind of research.
2
25
This is just the start. Coming up: better planning, more features, and more advanced agentic setups. There will be rough edges, but the team is shipping fast. Working on the rollout to more countries and languages. Any feedback let me know!
3
26
1,786
Start with 2 copies of an LLM. Every day, for each LLM: 1. Ask what it would like to read that day (e.g. selections of news or new books) 2. Feed it the content 3. Ask for learnings and conclusions and save to disk 4. Fine-tune it on all its learnings from its first day until now
2
1
25
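The four daily steps in the tweet above can be sketched as a loop. This is a toy illustration only: `ToyLLM`, `fetch` and `run` are hypothetical names, no real model or API is called, learnings are kept in memory rather than saved to disk, and "fine-tuning" is reduced to a counter.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLLM:
    """Hypothetical stand-in for a language model; no real API is implied."""
    name: str
    learnings: list = field(default_factory=list)  # in memory, not on disk
    finetune_rounds: int = 0
    last_training_set_size: int = 0

    def choose_reading(self, day):
        # Step 1: ask what it would like to read (placeholder answer).
        return f"{self.name}-reading-day-{day}"

    def reflect(self, content):
        # Step 3: ask for learnings and conclusions (placeholder answer).
        return f"notes on {content}"

    def finetune(self, all_learnings):
        # Step 4: fine-tune on all learnings so far (placeholder: bookkeeping).
        self.finetune_rounds += 1
        self.last_training_set_size = len(all_learnings)


def fetch(request):
    # Step 2: hypothetical content lookup for the requested reading.
    return f"content for {request}"


def run(models, days):
    """Run the daily loop for each model independently."""
    for day in range(1, days + 1):
        for model in models:
            content = fetch(model.choose_reading(day))
            model.learnings.append(model.reflect(content))
            model.finetune(model.learnings)
    return models


models = run([ToyLLM("A"), ToyLLM("B")], days=3)
```

Each copy keeps its own list of learnings, so the two models can drift apart depending on what they choose to read.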
Come see our poster on Neural Processes at @icmlconf: Hall B poster 130. Marta Garnelo woop woop!
3
25
It's clear to me that this can 'just' be scaled up now. The simulations will look just as real as the best movies or video or image models, but also feel as interactive as the best video games. It will fundamentally change how we think about simulations and entertainment.
1
4
26
1,900
Timely and important paper by @dbalduzzi, Marta Garnelo, @maxjaderberg and others on how you should train agents when there is no single winning strategy, e.g. in StarCraft. Thread below has a good summary.
Excited to share some new work on learning in games: arxiv.org/abs/1901.08106. The paper is about formulating useful objectives in nontransitive games (e.g. poker or StarCraft), which turns out to be a surprisingly subtle problem.
4
24
Spend a tiny bit of compute to decide if each datapoint should be trained on or not. Because of all the datapoints you SKIP, training is dramatically more efficient overall.
So excited to announce what we've been working on for the past ~year or so: Active Learning Accelerates Large-Scale Visual Understanding We show that model-based data selection efficiently and effectively speeds up classification- and multimodal pretraining by up to 50%
1
1
22
3,382
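A minimal sketch of the idea of scoring each datapoint cheaply and skipping the ones not selected. The scoring rule, the toy regression task and all names (`cheap_score`, `select`, `train`) are assumptions for illustration; this is not the criterion or the training setup of the quoted work.

```python
def cheap_score(example):
    """Placeholder scorer (an assumption): error of a fixed, crude guess y ~ 2x.
    A higher score means the example is treated as more worth training on."""
    x, y = example
    return abs(y - 2.0 * x)


def select(batch, keep_fraction):
    """Keep the top-scoring fraction of a batch and skip the rest."""
    k = max(1, int(len(batch) * keep_fraction))
    return sorted(batch, key=cheap_score, reverse=True)[:k]


def train(data, keep_fraction=0.5, batch_size=4, lr=0.01):
    """Fit y = w * x by SGD, updating only on the selected examples."""
    w, updates = 0.0, 0
    for start in range(0, len(data), batch_size):
        for x, y in select(data[start:start + batch_size], keep_fraction):
            w -= lr * 2.0 * (w * x - y) * x  # gradient step on squared error
            updates += 1
    return w, updates


# Toy data from y = 3x; half of every batch is skipped.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
w, updates = train(data)
```

Here the expensive step (the gradient update) runs on 4 of the 8 examples, while the scorer runs on all of them; the approach only pays off when scoring is much cheaper than training.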