ex-google brain and deepmind. phd in neural nonsense from stanford.

San Francisco, CA
Pinned Tweet
Real-world models are here! Stoked to share how we're bringing real-world locations to life by integrating Street View into Genie. Try it now at labs.google/fx/projectgenie and read the blog for more info: blog.google/innovation-and-a…
20
93
618
222,251
Happy to announce DreamFusion, our new method for Text-to-3D! dreamfusion3d.github.io We optimize a NeRF from scratch using a pretrained text-to-image diffusion model. No 3D data needed! Joint work w/ the incredible team of @BenMildenhall @ajayj_ @jon_barron #dreamfusion
127
1,388
5,458
How to upset the (few remaining) neuroscientists at NeurIPS 101
54
130
2,116
296,131
Stoked to share our work on Imagen Video! Diffusion models continue to unlock new possibilities for generative creativity: 3D with #DreamFusion last week, video with #ImagenVideo today 😎
Excited to announce Imagen Video, our new text-conditioned video diffusion model that generates 1280x768 24fps HD videos! #ImagenVideo imagen.research.google/video… Work w/ @wchan212 @Chitwan_Saharia @jaywhang_ @RuiqiGao @agritsenko @dpkingma @poolio @mo_norouzi @fleet_dj @TimSalimans
2
43
450
Happy to announce that I’m officially a doctor and will be joining Google Brain today! After an awesome time adventuring, I’m excited to get back to work on understanding and advancing artificial intelligence.
26
10
413
Text-to-3D synthesis from "a DSLR photo of a hand drawing a picture of a hand with a pencil" Doesn't quite master the recursion, but love the quality of the floating hand. #dreamfusion
7
37
396
Ridiculously good looking GAN results from Karras et al. (NVIDIA) by progressively growing the network: research.nvidia.com/sites/de…
4
176
390
What does research look like when you can no longer read all the relevant research papers? I used to read the daily feed, but now it's a full-time job just to read all the abstracts.
The number of AI papers on arXiv per month grows exponentially with doubling rate of 24 months. How can we cope with this? AI itself can help, by predicting & suggesting new research directions. Predicting the Future of AI with AI: arxiv.org/abs/2210.00881
15
37
381
Brush🖌️ is now a competitive 3D Gaussian Splatting engine for real-world data and supports dynamic scenes too! Check out the release notes here: github.com/ArthurBrussee/bru…
12
61
383
28,842
Stop watching videos, start interacting with worlds. Stoked to share CAT4D, our new method for turning videos into dynamic 3D scenes that you can move through in real-time!
12
49
362
45,953
Excited to share our work on image-to-3D scene generation: cat3d.github.io CAT3D uses a multi-view diffusion model to generate novel views, and just inputs these to NeRF/3DGS. Create anything in 3D in 1 minute!
16
50
367
48,148
Evolution is catching up to intelligent design for neural net architectures (94.6% vs. 96.7% on CIFAR-10): arxiv.org/abs/1703.01041
5
144
320
Successfully defended my PhD today! Thanks to @SuryaGanguli @drfeifei @ermonste @dyamins and Tom Clandinin for taking it easy on me :)
23
5
323
Free lunch theorem: for any idea, there exists a dataset where that idea performs well.
8
127
318
no thank you #neuralink
20
25
293
TIL tf.image.resize != torchvision.transforms.Resize unless you set antialias=True. Something to check when porting and comparing models between frameworks 🙃
5
39
306
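A toy illustration of why an antialias flag changes downsampled outputs. This is a pure-Python 1D sketch and does not reproduce the resampling code of `tf.image.resize` or `torchvision.transforms.Resize`; the function name `resize_linear_1d` and the test signal are hypothetical. It assumes half-pixel-centre sampling and a triangle (linear) filter that is widened by the downsampling factor when antialiasing is on.

```python
def resize_linear_1d(signal, out_len, antialias):
    """Resize a 1D signal with a triangle (linear) filter.

    Hypothetical helper for illustration only, not either library's code.
    """
    n = len(signal)
    scale = n / out_len
    # Antialiasing widens the filter by the downsampling factor;
    # without it the filter always spans one input pixel on each side.
    radius = max(scale, 1.0) if antialias else 1.0
    out = []
    for i in range(out_len):
        x = (i + 0.5) * scale - 0.5  # half-pixel-centre mapping into input coords
        total = 0.0
        weight_sum = 0.0
        for j in range(n):
            w = max(0.0, 1.0 - abs(x - j) / radius)
            total += w * signal[j]
            weight_sum += w
        out.append(total / weight_sum)  # normalise so weights sum to 1
    return out


# A spike every 4 samples, downsampled 4x.
signal = [1.0, 0.0, 0.0, 0.0] * 4
plain = resize_linear_1d(signal, 4, antialias=False)
smooth = resize_linear_1d(signal, 4, antialias=True)
print(plain)   # every spike falls between the two sampled neighbours
print(smooth)  # the wider filter picks up each spike
```

Without antialiasing every output sample lands between two zero-valued inputs, so the spikes vanish entirely; with it, each output is a weighted average over the whole window. Two frameworks that differ only in this flag will therefore disagree on downsampled images.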
We just released an example notebook for unrolled GANs on github! Very easy to implement using TF's graph_replace: github.com/poolio/unrolled_g…
1
136
304
Want to estimate or optimize mutual information using neural networks and the latest variational bounds? Check out our Colab notebook for implementations and experiments! Colab: colab.research.google.com/gi… Paper: arxiv.org/abs/1905.06922
2
75
303
🤯
4
27
293
Big hierarchical VQ-VAEs with autoregressive priors do amazing things. Awesome work from @catamorphist @avdnoord @OriolVinyalsML: arxiv.org/abs/1906.00446
2
77
300
BigBiGAN shows that "progress in image generation quality translates to substantially improved representation learning performance." Competitive w/self-supervised approaches on ImageNet. The cycle from generative models to other methods and back again continues.
Replying to @BrundageBot
Large Scale Adversarial Representation Learning. Jeff Donahue and Karen Simonyan arxiv.org/abs/1907.02544
3
65
286
Excited to share that DreamFusion has won an Outstanding Paper Award at #ICLR2023: blog.iclr.cc/2023/03/21/anno… Thanks to amazing coauthors @BenMildenhall @ajayj_ @jon_barron and great feedback from colleagues and reviewers that improved the paper. See y'all in Rwanda!
22
20
282
33,156
Cool work on opening closed eyes with GANs: bdol.github.io/exemplar_gans… Would love to see this productionized so I don't need to worry about staring into the sun, blinking, or sleeping in lectures.
7
79
263
Join our team and build the future of generative worlds! We are at an incredibly exciting moment where research prototypes are becoming useful technology for capture, creation, and interaction in 3D worlds.
We're hiring for full-time roles in NYC and SF, link to the listing is below.
2
19
283
43,882
peer review in machine learning is broken #ICLR2020
10
37
262
unfortunate ICLR metareview typo: "slightly under the acceptance trashhold"
6
27
263
Stoked to share our work on realtime interactive video models 🌎🕹️🎉
What if you could not only watch a generated video, but explore it too? 🌐 Genie 3 is our groundbreaking world model that creates interactive, playable environments from a single text prompt. From photorealistic landscapes to fantasy realms, the possibilities are endless. 🧵
7
15
270
14,240
Overly useful ML hack: to increase the gradient for a parameter w by a factor k, divide the initial value by k and scale w by k before using it: w = Variable(w0) → w = k * Variable(w0/k)
7
32
256
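A minimal numeric check of the reparameterization trick above, assuming a toy quadratic loss and plain SGD (both are illustrative choices, and the helper names are hypothetical). The stored variable is v = w0 / k and the effective weight is w = k * v, so the gradient with respect to the stored variable is k times larger, and one SGD step moves the effective weight k² times further.

```python
def loss(w):
    # Toy quadratic loss with its minimum at w = 3 (an assumption for the demo).
    return 0.5 * (w - 3.0) ** 2


def grad_wrt_stored(k, w0, eps=1e-6):
    """Central-difference gradient of the loss w.r.t. the stored variable v."""
    v = w0 / k                      # stored variable, initialised to w0 / k
    f = lambda v_: loss(k * v_)     # effective weight is w = k * v
    return (f(v + eps) - f(v - eps)) / (2 * eps)


def sgd_step_effective(k, w0, lr=0.1):
    """Change in the effective weight w after one SGD step on v."""
    v = w0 / k
    v_new = v - lr * grad_wrt_stored(k, w0)
    return k * v_new - w0


w0 = 1.0
g1 = grad_wrt_stored(1.0, w0)      # plain parameterization
g10 = grad_wrt_stored(10.0, w0)    # reparameterized with k = 10
step1 = sgd_step_effective(1.0, w0)
step10 = sgd_step_effective(10.0, w0)
print(g10 / g1)        # gradient ratio, close to k
print(step10 / step1)  # effective step ratio, close to k**2
```

Both parameterizations start from the same effective weight w0, so the forward pass is unchanged at initialization; only the update size differs. With an adaptive optimizer the effect on the step would not be the same as under plain SGD.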
"bear trying a too soft mattress" #dreamfusion
2
13
240
"a photo of a red panda reading the research paper 'attention is all you need'" #Imagen
10
18
246
critical capybara #veo3
3
29
242
23,044
Come intern at Google DeepMind in 2025! We've got a rad generative 3D crew in SF 🤓
Our group at Google DeepMind is now accepting intern applications for summer 2025. Attached is the official "call for interns" email; the links and email aliases that got lost in the screenshot are below.
4
21
248
37,001
time for a nap
4
9
237
16,638
the best current method for text-to-3d scenes is text-to-video followed by 3D reconstruction
will it nerf? yep ✅ congrats to @_tim_brooks @billpeeb and colleagues, absolutely incredible results!!
3
28
229
31,126
Interested in deep learning, mutual information, and variational bounds? Come check out my poster w/ @sherjilozair @avdnoord @alemi and @georgejtucker at 17:30 in the #NeurIPS2018 Bayesian Deep Learning workshop!
1
62
230
It's impossible to keep up with papers, but please take a few days to review literature and ask around before spending months on a research project #reviewer2 #neurips2019
9
9
230
speech cloning + few shot image generation = unlimited fake video content of anyone doing anything. deepfakes was just the tip of the iceberg.
Our neural network based system learned to "clone" a voice with less than a minute of audio data from the speaker. Check out our paper to find out more about this latest breakthrough in speech synthesis. #DeepLearning #MachineLearning #AI bit.ly/2GvGhBP
1
120
222
A voice of reason at the BigNeuro panel: "we are very very far from human-level AI... maybe decades or centuries" - Yoshua Bengio #NIPS2017
10
78
216
Replying to @ericjang11
For diffusion models you can just combine the score functions! See e.g. arxiv.org/abs/2206.01714 I'm guessing this is how MJ integrated SD so quickly: classifier/score-based guidance makes for easy composability of models and signals.
1
19
202
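A small sketch of what "combine the score functions" means, using two 1D Gaussians as stand-ins for two models (the parameters and function names are made up for illustration). The score of a product of densities is the sum of the individual scores, which the code checks against the closed-form product Gaussian. This verifies the identity for exact densities only; summing learned scores at intermediate noise levels is, in general, an approximation.

```python
# Two hypothetical "models", each an exact 1D Gaussian density.
MU_A, SIGMA_A = -1.0, 1.0
MU_B, SIGMA_B = 2.0, 0.5


def gaussian_score(x, mu, sigma):
    # Score = d/dx log N(x; mu, sigma^2)
    return -(x - mu) / sigma ** 2


def composed_score(x):
    # Composition: add the two score functions.
    return gaussian_score(x, MU_A, SIGMA_A) + gaussian_score(x, MU_B, SIGMA_B)


# Closed form: the product of two Gaussians is proportional to a Gaussian
# with summed precisions and a precision-weighted mean.
precision = 1.0 / SIGMA_A ** 2 + 1.0 / SIGMA_B ** 2
product_mu = (MU_A / SIGMA_A ** 2 + MU_B / SIGMA_B ** 2) / precision
product_sigma = precision ** -0.5


def product_score(x):
    return gaussian_score(x, product_mu, product_sigma)


for x in (-2.0, 0.0, 1.4, 3.0):
    print(x, composed_score(x), product_score(x))
```

Because the composition happens at the level of scores, no normalizing constant for the product density is ever needed, which is what makes guidance-style mixing of models and signals convenient.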
DreamFusion generates 3D models from diverse text prompts. Check out our gallery of hundreds of 3D models: dreamfusion3d.github.io/gall…
2
24
191
New paper on an information-theoretic framework for understanding VAEs! Points to challenges and new directions. arxiv.org/abs/1711.00464
1
58
191
Denoising autoencoders fit to medical records learn a representation that predicts patient outcome: nature.com/articles/srep2609…
2
94
187
Congrats to the Luma team on reproducing and launching DreamFusion just 2 months after paper release! Looking forward to seeing what folks create. Check out our work to learn about the method behind this tech: dreamfusion3d.github.io
✨ Introducing Imagine 3D: a new way to create 3D with text! Our mission is to build the next generation of 3D and Imagine will be a big part of it. Today Imagine is in early access and as we improve we will bring it to everyone captures.lumalabs.ai/imagine
7
20
187
Derive GANs from mutual information! Y ~ Bernoulli(1/2), M = Y * P + (1-Y) * Q (equal mixture of P and Q). JS(P; Q) = I(Y; M) = H(Y) - H(Y|M) = log(2) + E[log p(y|m)] = log(2) + E[log q(y|m)] + E[KL(p(y|m) || q(y|m))] >= log(2) + E[log q(y|m)]
5
34
183
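A numeric sanity check of the JS / mutual-information identity on a small discrete example, taking H(Y|M) = -E[log p(y|m)] so that I(Y; M) = log(2) + E[log p(y|m)]. The distributions P and Q and the "discriminator" values are arbitrary choices for illustration. Plugging the true posterior p(y=1|m) into the bound recovers the Jensen-Shannon divergence; any other classifier q(y|m) gives a lower value, which is the GAN discriminator objective up to the log(2) constant.

```python
import math

# Two arbitrary discrete distributions over three outcomes (illustrative only).
P = [0.7, 0.2, 0.1]
Q = [0.1, 0.3, 0.6]


def kl(a, b):
    return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)


# Equal mixture M and the Jensen-Shannon divergence in nats.
mix = [0.5 * p + 0.5 * q for p, q in zip(P, Q)]
js = 0.5 * kl(P, mix) + 0.5 * kl(Q, mix)

# True posterior p(y=1 | m): probability that sample m came from P.
posterior = [0.5 * p / m for p, m in zip(P, mix)]


def bound(d):
    """log(2) + E[log q(y|m)] with q(y=1|m) = d[m], q(y=0|m) = 1 - d[m]."""
    return math.log(2) + sum(
        0.5 * p * math.log(dm) + 0.5 * q * math.log(1.0 - dm)
        for p, q, dm in zip(P, Q, d)
    )


mi = bound(posterior)              # tight: equals I(Y; M)
lower = bound([0.6, 0.5, 0.3])     # an imperfect, made-up discriminator
print(js, mi, lower)
```

The gap between `mi` and `lower` is exactly the expected KL between the true posterior and the classifier, so training the discriminator tightens the bound.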
Reviewer 2: proposed method shows no improvement on ImageNet, weak reject
4
8
189
Diffusion models rock: stable training, high quality samples, improved diversity, and moderately fast sampling. Awesome work from @prafdhar and @unixpickle showing that improved diffusion architectures + classifier guidance outperforms GANs on ImageNet.
Diffusion Models Beat GANs on Image Synthesis Achieves 3.85 FID on ImageNet 512×512 and matches BigGAN-deep even with as few as 25 forward passes per sample, all while maintaining better coverage of the distribution. arxiv.org/abs/2105.05233
3
26
185
New notebook implementing Adversarial Variational Bayes in TensorFlow: gist.github.com/poolio/b71eb…
1
70
181
donut house flythrough #veo2
2
10
175
11,246
we have liftoff #veo3
3
12
183
10,989
So many incredible generative 3D papers submitted to ICLR! One year after DreamFusion and folks have improved efficiency to 20 seconds: instant-3d.github.io/ 🤯 Happy weekend reading 🙃 openreview.net/group?id=ICLR…
2
16
181
41,216
Happy one week anniversary to #DreamFusion (dreamfusion3d.github.io/) 🥳 Thanks to GitHub user ashawkey, you can try it out now: github.com/ashawkey/stable-d… Amazed at the speed of the open source community, and power of open diffusion models. Can't wait to see what people create!
An implementation of text-to-3D dreamfusion, powered by stable diffusion github: github.com/ashawkey/stable-d…
2
34
169
10 years ago I was working on deep models for single neurons and couldn't believe this slide. Crazy how right @ilyasut has been about AI progress, but neuroscience is still so hard.
8
8
162
19,004
knock knock #veo3
3
18
171
18,722
This ain't right. Timnit is one of the most amazing researchers and authentic humans in our field, and we were blessed to have her at Google. We have to do better.
Apparently my manager’s manager sent an email my direct reports saying she accepted my resignation. I hadn’t resigned—I had asked for simple conditions first and said I would respond when I’m back from vacation. But I guess she decided for me :) that’s the lawyer speak.
1
15
150
We have been calling this issue where the learned 3D model has multiple faces the Janus problem (en.wikipedia.org/wiki/Janus) h/t @jon_barron View-dependent prompting helps, but doesn't solve it in all cases as seen with the DreamFusion model of the squirrel below.
Replying to @_akhaliq
Failure cases: "A DSLR photo of a squirrel"
6
31
159
Woohoo, big congrats to the World Labs team! Tech looks similar to CAT3D (cat3d.github.io): multi-view diffusion model + 3DGS, maybe w/360 data + depth priors. To bring these worlds to life with dynamics, check out our new work on CAT4D: cat-4d.github.io 😺
We’ve been busy building an AI system to generate 3D worlds from a single image. Check out some early results on our site, where you can interact with our scenes directly in the browser! worldlabs.ai/blog 1/n
2
9
157
15,762
"Racism is a well-oiled machine. It really doesn't require people to proactively do much at this point, it can perpetuate itself... AI has lubricated that, it's like using WD-40 on this extremely well-lubricated machine to begin with." - @red_abebe at @ResistanceAI panel
2
36
134
Come learn about variational bounds of mutual information tomorrow (Thursday) at #ICML2019, 4:40pm in the Grand Ballroom or drop by poster #86 at 6:30pm! Joint work w/awesome collaborators @sherjilozair @avdnoord @alemi @georgejtucker arxiv.org/abs/1905.06922
29
158
Woohoo! This is Google's fork of IPython notebooks w/ multiple users, remote kernels, and more goodies. Hope it merges back into Jupyter.
One of my favorite internal Google tools is now available externally colab.research.google.com
2
48
150
Folks, please spend more time searching for prior work. This paper on the flipped adversarial autoencoder (arxiv.org/abs/1802.04504) is the same as InfoGAN (arxiv.org/abs/1606.03657). We need better ways to summarize, distill, and distribute research so this stops happening.
8
16
148
When arguing with reviewers that they misunderstand prior work, keep in mind that they may be the author of that prior work.
8
11
143
Logging into Twitter after the CVPR deadline...

[GIF: pizza chaos]

6
145
19,552
Learning to learn has gone from a fringe research area to a Google I/O keynote in just 1 year. The pace of progress in ML is insane.
4
53
145
Exciting topic, bold claims: "This paper explains why deep learning can generalize well...". Looking forward to reading!
Generalization in Deep Learning. (arXiv:1710.05468v1 [stat.ML]) ift.tt/2hKCZ30
4
34
138
Super duper excited to share our new paper on score-based generative modeling! Stable training, exact likelihoods, high resolution samples, and much much more! Amazing work from @YSongStanford's internship with us at @GoogleAI 🧵👇
Happy to announce our new work on score-based generative modeling: high quality samples, exact log-likelihoods, and controllable generation, all available through score matching and Stochastic Differential Equations (SDEs)! Paper: arxiv.org/abs/2011.13456
1
11
139
Cool work on Machine Theory of Mind from Neil Rabinowitz et al. (@DeepMindAI): arxiv.org/abs/1802.07740 Learns a system that can build models of other agents from observations alone. Neat direction for human-machine interaction and understanding of artificial agents!
1
54
136
The 3D model we generate is an improved NeRF that produces a 3D volume with density, color, and surface normals:
3
22
135
Optimize for simplest mask that confuses classifier to get interpretable explanations. Neat work from Fong&Vedaldi: arxiv.org/abs/1704.03296
1
44
134
Jointly train classifier & adversarial example generator, GAN-style -> improved adv. robustness & generalization arxiv.org/abs/1705.03387
1
38
131
One paper can change your life. But which one? Overproductivity doesn't just come from paper counting, but from the desperate acts of young researchers under extreme pressure to be part of that one paper.
Since we just wrapped up an AI megaconference, it felt like a good day to plead for fewer papers. argmin.net/p/too-much-inform…
1
2
134
44,788
veo team is hiring, join the fun :) the yeti videos are cool, but there's still so much unknown in how to build spatial intelligence and useful creative tools!
Want to be part of a team redefining SOTA for generative video models? Excited about building models that can reach billions of users? The Veo team is hiring! We are looking for amazing researchers and engineers, in North America and Europe. Details below:
4
7
126
11,719
Humans are still the best lossy image compressors: arxiv.org/abs/1810.11137 Human describer views input image and communicates text to human reconstructor who uses image editing software to recreate an image. Fun paper led by 3 *high school* students interning at Stanford!
3
33
119
Want to learn latent variable models with powerful decoders? Check out our #iclr2019 accepted paper w/Ali Razavi, @avdnoord, @OriolVinyalsML that prevents posterior collapse with δ-VAEs: arxiv.org/abs/1901.03416 Key idea: choose family of q(z) such that KL(q(z) || p(z)) > δ
19
120
Check out time-warped PCA, an unsupervised approach to aligning neural data, tonight @ #cosyne17 poster III-14 w/@niru_m @ItsNeuronal #twpca
4
34
117
Every time I fire a l̵i̵n̵g̵u̵i̵s̵t̵ graphics researcher, performance goes up.
"Sora is a data-driven physics engine."
6
5
116
17,579
TL;DR: use x * sigmoid(x) for neural net activations. Multiplicative interactions strike again!
Replying to @Miles_Brundage
"Swish: a Self-Gated Activation Function," Ramachandran et al.: arxiv.org/abs/1710.05941
5
27
114
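For reference, the x * sigmoid(x) activation from the TL;DR above can be written in a few lines. This is a scalar, pure-Python sketch with hypothetical function names; a real model would use the framework's vectorized version.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def swish(x):
    # The input gates itself: x multiplied by sigmoid(x).
    return x * sigmoid(x)


for x in (-5.0, -1.0, 0.0, 1.0, 5.0):
    print(x, swish(x))
```

For large positive inputs the function approaches the identity, for large negative inputs it approaches zero, and in between it dips slightly below zero, so unlike ReLU it is smooth and non-monotonic.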
Nothing like waking up to see the 3D models we generated yesterday 3D printed in the real world today 😍 #dreamfusion
1
3
116
the sweater frogs can moooove #veo2
3
4
115
5,764
no i'm eating
3
6
113
Woohoo, code is now available for the kickass image VAE work from... *drumroll* ... the team @OpenAI!
code released for Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on Images pdf: openreview.net/pdf?id=RLRXCV… github: github.com/openai/vdvae
4
13
114
MNIST is down: yann.lecun.com/exdb/mnist/ Maybe it's a sign I should try out some other datasets...
10
4
111
Headed to Kigali for #ICLR2023! Excited to meet folks and share the research that generated these 3D models
2
2
111
12,434
ReconFusion = 3D Reconstruction + Diffusion prior for novel view synthesis reconfusion.github.io Better NeRFs, less data.
Excited to share ReconFusion! 3D reconstruction of real-world scenes from only a few photos, powered by diffusion priors: reconfusion.github.io w/ amazing team @ChrisWu6080 @BenMildenhall @philipphenzler @KeunhongP @RuiqiGao @watson_nn @_pratul_ @dorverbin @jon_barron @poolio
5
10
106
17,929
diffusion = flow matching. great blog post from the experts on the synergy of these frameworks. parameterization and weighting matter, and i love how the choices in flow matching lead to simpler implementations compared to our early score-based SDE/diffusion model work!
A common question nowadays: Which is better, diffusion or flow matching? 🤔 Our answer: They’re two sides of the same coin. We wrote a blog post to show how diffusion models and Gaussian flow matching are equivalent. That’s great: It means you can use them interchangeably.
2
9
110
15,263
Love this work from Justin Gilmer et al. (Google Brain) on "adversarial spheres." Presents a tractable toy model for adversarial examples and proves that small misclassification error yields adversarial examples in high dimensions: arxiv.org/abs/1801.02774
2
27
105
Wild to compare #veo2 below to Imagen Video (SOTA from just 2 years ago):
A cat jumps on a couch. #veo2
4
6
103
19,161
Progressive distillation (arxiv.org/abs/2202.00512) is awesome! Generation time can be reduced from 10 minutes to 30 seconds with minimal loss in quality.
Replying to @hojonathanho
With the help of progressive distillation, Imagen Video can generate high quality videos using just 8 diffusion steps per sub-model. This speeds up video generation time substantially, by a factor of ~18x.
2
9
107
Excited to present our award-winning DreamFusion research today at #ICLR2023! Talk at 3:40pm in AD12, and poster #73 at 4:30pm. Have a few souvenirs to distribute too 🐸👻🐷🐶
11
104
9,810
catpacking #veo2
1
10
108
20,954
The highest quality 3D reconstruction pipeline is now open source!
We just finished a joint code release for CamP (camp-nerf.github.io/) and Zip-NeRF (jonbarron.info/zipnerf/). As far as I know, this code is SOTA in terms of image quality (but not speed) among all the radiance field techniques out there. Have fun! github.com/jonbarron/camp_zi…
7
101
10,027
This was an incredibly fun team effort w/ NeRF wizards @BenMildenhall & @jon_barron, and NeRF + diffusion expert @ajayj_ (graduating this year!). We're excited to incorporate our methods with open source models and enable a new future for 3D generation! 🚀 #dreamfusion
6
9
103
StyleGAN-T generates faster and better samples than diffusion models at lower resolution (64x64) but underperforms at higher res (256x256). Excited to learn some new GAN tricks and for more diversity in research ideas around generative models :)
StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis significantly improves over previous GANs and outperforms distilled diffusion models in terms of sample quality and speed abs: arxiv.org/abs/2301.09515 project page: sites.google.com/view/styleg…
2
17
103
28,295
We need a @schmidhubered consulting service. Tell them your idea, get back a list of references you have missed. Honestly, it would be a great service to the community and useful at the early stages of research projects.
8
22
98
Check out our poster on expressivity of random neural networks! Tonight 6-9pm #95 #nips2016
1
21
102
One day after complaints of the NIPS capsules paper, we've got a shiny new one! Patience, DL community. Patience. openreview.net/pdf?id=HJWLfG…
1
36
100
"Timnit responded with an email requiring that a number of conditions be met in order for her to continue working at Google" - @JeffDean (platformer.news/p/the-wither…) Geez, must be some pretty wild demands to lose such an amazing colleague. Wonder what they are? 😠
Replying to @debarrosmarcelo
Easy. 1. Tell us exactly the process that led to retraction order and who exactly was involved. 2. Have a series of meetings with the ethical ai team about process. 3. Have an understanding of research parameters, what can be done/not, who can make these censorship decisions etc.
2
5
96
New paper from Krotov & Hopfield shows that dense associative memory models are robust to adversarial inputs: arxiv.org/abs/1701.00939
5
35
102
BYOL works even without batch statistics! arxiv.org/abs/2010.10241 surprising result that refutes the critical role of BN as implicit contrastive learning (untitled-ai.github.io/unders…, arxiv.org/abs/2010.00578) so... why does it work?
3
23
98
In spite of the limitations of current generative models, they can create something that really feels like AI Magic! To think I was pretty darn proud of these samples 7 years ago...
Introducing AI Magic Tools Dozens of creative tools to edit and generate content like never before. New tools added every week. Available now: runwayml.com
4
4
94
"a skeleton juggling pumpkins" happy halloween from #dreamfusion!
1
8
96
Great perspective from @BachFrancis Q&A: “For science it’s not number of viewers, it’s not even number of citations, it’s something more complicated. Whether your work tackles important questions… and that can’t be seen simply by a single number or social media.”
2
12
95