science at deepmind, prev: google brain

San Francisco, CA
research is beautiful because for a brief moment in time you know something that no one else in the world knows
35
253
1,791
203,757
Our team at DeepMind is growing (again). 🚀 We're tackling grand challenges in semiconductors, magnets, energy materials, superconductors, and beyond. Join us! Two positions below.
13
43
784
368,141
Today in @Nature, our team at @GoogleDeepMind is excited to share GNoME, a deep learning system that increases the number of stable crystal materials known to humanity by an order of magnitude. From the data, we train an ML force field with unprecedented capabilities at scale👇
Introducing GNoME: an AI tool that helped discover 2.2 million new crystals. 💎 Crystals are found in everything from the chips powering our phones to solar cells creating clean energy. The model also better predicts the stability of new materials. 🧵 dpmd.ai/GNoME-AI
8
49
348
76,585
🚨 Our team at GDM is hiring a research engineer to work on topics around RL, post-training + materials science! Role is based in Mountain View. DMs open if you have questions.
2
20
196
45,817
🧪🔬 Synthesis experts! Our team at Google DeepMind is hiring a scientist to establish and lead an AI-driven laboratory for materials discovery. The team is working to combine our AI capabilities with automated experimentation to discover novel functional materials. 1/
3
28
133
34,622
Our team at GDM is hiring a software engineer to help us build the future of AI-driven scientific discovery. 🚀 If you want to work with us on scaling scientific simulations, AI agents, and our latest training runs, come join us!
4
11
111
12,537
Yet another NequIP in the top 10 with EquFlash, this time with some very clever accelerations! Bringing the total to ….? I’ll leave it to you how to count :) One question this raises is what a lot of folks have told me recently both on here and in private: they find it “disheartening” (to quote @SamMBlau) that we’ve had the same SOTA architecture since January 2021 now. My answer is always the same: we’re not building these models for the sake of building models. We’re building them because there are fundamental challenges that require the discovery of novel materials. These algorithms accelerate that. If the FF architecture isn’t the bottleneck, you should stop optimizing it and focus on more interesting problems (data, data, data, evals, scalability, and above all, actually finding and making materials). I can think of at least one other field that flourished when they stopped playing the architecture game. Take my words with a grain of salt though. I was told at APS 2019 by a very “senior” person in the field that the fitting problem of MLIPs was “solved”. That turned out to be horribly wrong. I’m rooting for every grad student to make a meaningful dent in this problem. And who knows, maybe there is more juice to be squeezed beyond a 1 meV/atom MAE difference. (also if you’re building molecular FFs, different story, this is a materials benchmark)
5
10
65
20,809
Replying to @HannesStaerk
I raise you:
2
3
30
3,092
Replying to @sorenmind
this is easily solvable for a 5-year old
3
1
30
8,383
More from @brendanbc2:
At @GoogleDeepMind, our world-class team of quantum materials experts, engineers, and AI researchers is using massive-scale compute and AI to revolutionize materials discovery. We're expanding! We are looking for truly exceptional computational materials scientists. 👇
1
29
16,298
Return of the quip! First put on arxiv in 2021…
great news for all the people who over the past year reached out to ask why Nequip and Allegro were missing from Matbench Discovery. they're finally up as of today thanks to outstanding work by @Kavanagh_Sean_ and the MIR group @bkoz37. Leaderboard: matbench-discovery.materials…
1
1
21
3,385
Replying to @HochreiterSepp
Except it's not SOTA on MD17, NequIP is *still* SOTA. They cite long outdated numbers for NequIP. The current numbers show that our equivariant MPNN NequIP outperforms the Transformer on *every single* molecule in MD17.
1
11
Finally, my favourite example of the zero-shot capabilities of the model is this: the potential has never seen Nickel (the material, it has seen the element) and has never seen a melt. Yet, it does remarkably well on this structure:
1
12
1,927
This is a founding role within a highly ambitious project. The lab will be crucial for synthesizing and characterizing novel materials, validating AI-generated hypotheses, and generating high-quality data to refine our models 2/
1
1
10
2,033
Pretty sweet outcome for a master's thesis, congratulations, Hannes!
1
10
When evaluated on @jrib_'s MatBenchDiscovery, the model again does well (note that we only eval'ed on this long after the potential was trained, we never explicitly tried to optimize for this).
2
2
10
13,172
Replying to @MicheleCeriotti
Has been a really well-organized conference, the format of junior+senior researchers sharing a talk is a great idea, hope people pick that up. Thank you for organizing!
1
8
555
Wonderful news, again congrats, Boris, so well deserved.
5
521
Inorganic crystal materials power the world around us, from microchips to solar cells to batteries. Yet, humanity only knows ~48,000 of them. Today, that number is an order of magnitude larger.
1
7
901
Don't sell yourself short, Tim, you've done quite a lot!
1
6
566
Position is based in Mountain View, CA. Note that this is separate from the postings from last week. job-boards.greenhouse.io/dee…
1
7
1,297
Replying to @xiangfu_ml
That author list is squad goals. Congrats, Xiang!
1
7
763
I'm pretty happy with basic html to be honest. Has a tad of a 90s vibe, sure, but if minimalist is what you're going for, might be hard to beat. simonbatzner.github.io/
6
Replying to @ylecun
Maybe heaven is - where the culinary AI is fine-tuned on French data - the car-mechanic AI is fine-tuned on German data - and the AI organizing it all is fine-tuned on Swiss data @ylecun? ;)
1
1
6
By scaling up graph neural networks, we discover 2.2 million novel crystals stable w.r.t. existing efforts. We're super excited to share the 380k on the new convex hull and have them hosted on Materials Project: materialsproject.org/gnome
1
6
845
Replying to @marceldotsci
Only criterion is ambitious people and big, worthwhile goals.
6
Replying to @GabriCorso
We spent hours of our PhDs on this, but one big upside is you learn a ton about your methods from users. For our potentials, we >10x-ed the applications we would otherwise have seen. That experience then shaped research directions b/c we had a better sense of what people wanted.
6
575
Huge congratulations, Melanie!!! Can’t wait for you to join SEAS, more power to geometry+ML!
5
Replying to @HannesStaerk
Lixin Sun, who is straight up amazing! scholar.harvard.edu/lixinsun
5
Replying to @HannesStaerk
Hahahaha, the translation of "Freunde der Sonne", 10/10 Hannes
4
466
Replying to @MicheleCeriotti
Alby and I did these as a post-APS trip in March, absolutely gorgeous! (also was not the worst place to finish a PhD thesis)
5
404
Replying to @rachel_kurchin
Preach! Now try pointing it out to them and see what happens.
1
5
Some more context here from Demis:
Very excited about our progress on materials! Super cool work, come join the AI for Science team.
1
5
976
The resulting data from structural relaxations give us access to a diverse and large pretraining dataset for ML potentials. The resulting potentials exhibit remarkable zero-shot capabilities at scale. First, we find that model performance improves predictably with dataset size.
1
5
836
Amazing news, glad it commutes (although part of me was expecting a vertical Michele). It’s been so fun to follow your work, best of luck for what’s next.
4
392
It all makes sense when you translate it back, but I had to read "Dichtefunktionaltheorie" alone three times the first time I saw it.
1
5
Agree with all of Taco's points. Here's a slide I've used in the past to summarize this:
1
4
Replying to @marceldotsci
that should actually work though?
1
4
617
Replying to @jgreener64
Great post Joe, found myself thinking "yes, thank you!" on many of your ML potentials arguments. One point I disagree on is that I believe our field would indeed benefit a lot if stable simulations were a bare minimum requirement for every paper.
2
5
1,586
a lot of smart people are working on better sampling methods than integrating Newton's EoM explicitly. Together I think that will really push ML potentials to be extremely useful in biomolecular applications. 3/3
1
4
265
And obviously papers like this that accelerate the methods meaningfully are incredibly useful and a step in the right direction.
4
1,437
Replying to @Zergylord
Another reason not to cite it :)
1
4
high-energy physics
3
This team is the most talent-dense group of ppl I’ve ever worked with. Join the fun!
1
6
1,204
Replying to @GabriCorso
Congrats Gabriele, well deserved. Really enjoyed reading the work.
1
4
I believe you forgot the "it doesn't matter" option. The validation error is the only thing that will tell you about how well the model generalizes to test data. If the training error is lower in one of your settings, then this simply means the model has ... 1/2
1
2
But keep in mind that e.g. the FCHL19 descriptor also has access to 3-body/angular information. If you read the thread you see that this holds against a number of other approaches. 2/2
2
4
Looks like a great course, thanks for posting.
3
But I believe the most important part is that representation theorists are a bit sloppy: sometimes they call the linear transformation acting on a vector space V the representation, and sometimes by representation they just mean that vector space V.
3
Replying to @marceldotsci
There's so many "oh shit that's how that works" moments in there.
3
Great work, Volker! Would you mind sharing how long the 1mn-atom simulation with MTP was run for and how expensive that was in terms of s/atom/force call? Thanks!
2
3
Sure! Some answers below in a long thread + also paging a few senior experts @venkvis, @vl_deringer, @NArtrith who have done work on this. 1. Accuracy of classical FFs is pretty bad. As an example, in [1] we benchmarked a classical FF on ... 1/
1
2
Replying to @marceldotsci
You can also go the other way and call Python (and thereby ASE) from within LAMMPS: docs.lammps.org/Python_call.…
2
3
As @IlyesBatatia correctly points out, it's not understood. Currently, claiming it's only body-order or only equivariance lacks evidence. There are a few reasons for this: 1. The methods we compared scaling laws to in the original NequIP paper *did* include 3-body methods. 1/
1
2
Re scale/speed, I think ultimately it comes down to show-don't-tell. If your method is claimed to be scalable, you also have to do the work and scale it. 9/9
2
Replying to @Ella_M_King
Hm, my guess is that adding milk is a linear mix of T_tea/T_milk, but cooling itself is an exponential decay. So if you add right away, you jump to a more shallow part of the decay curve. Then lose less over the 10 mins?
1
3
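The guess in the reply above can be checked with a small sketch. It assumes Newton's law of cooling (exponential relaxation toward room temperature), a mix that is linear in temperature, and the same cooling rate with or without milk. All temperatures, the rate, and the milk fraction are illustrative assumptions, not measurements.

```python
import math


def cool(temp, ambient, rate, minutes):
    """Newton's law of cooling: exponential relaxation toward ambient."""
    return ambient + (temp - ambient) * math.exp(-rate * minutes)


def mix(tea_temp, milk_temp, milk_fraction):
    """Linear mix, assuming equal heat capacity per unit volume."""
    return (1.0 - milk_fraction) * tea_temp + milk_fraction * milk_temp


# Illustrative assumptions only.
TEA, MILK, AMBIENT = 90.0, 5.0, 20.0    # degrees Celsius
RATE, WAIT, FRACTION = 0.05, 10.0, 0.2  # 1/min, minutes, milk share of the cup

# Option A: add milk right away, then wait.
milk_first = cool(mix(TEA, MILK, FRACTION), AMBIENT, RATE, WAIT)
# Option B: wait, then add milk.
milk_last = mix(cool(TEA, AMBIENT, RATE, WAIT), MILK, FRACTION)

print(f"milk first: {milk_first:.2f} C, milk last: {milk_last:.2f} C")
```

Under these assumptions the gap between the two options works out to `FRACTION * (1 - exp(-RATE * WAIT)) * (AMBIENT - MILK)`, so adding milk first ends up warmer only when the milk is colder than the room, and the two orders tie if the milk is at room temperature.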
We see particularly strong improvements on out-of-distribution robustness and find that even zero-shot is highly competitive!
1
3
791
OVITO is fast, memory-efficient, and easy to use.
2
528
This is great, love the power of exploiting equivariance!
3
In case you had not come across it, this work partially explores your concern: arxiv.org/abs/1902.10811
1
3
Replying to @marceldotsci
Alby did quite a bit of accuracy testing (equivariance etc.) for NequIP/Allegro and found that by keeping F64 and F32 in a few not-performance-critical places, he could get a 2.7x speedup from TF32 without really losing accuracy (arxiv.org/abs/2304.10061)
1
3
455
Agree. In ML terminology, meta-learning would mean learning-to-learn a PES. Here it looks like you're using the labels from one ML potential to train a second one, which would be referred to as distillation, as David points out.
1
3
Finally, a lot of these approaches scale pretty terribly with number of elements (S^k, where S is number of species and k+1 is the body-order), and when you're modelling battery materials you will often have 3 or more species, so you will quickly run into big problems 10/
2
2
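To make the S^k scaling from the post above concrete, here is a minimal sketch that only evaluates that formula. The function name and the chosen species counts are hypothetical, and the sketch counts ordered species tuples exactly as S^k, without any symmetry reduction a specific method might apply.

```python
def species_combinations(num_species: int, body_order: int) -> int:
    """Return S^k, where S is the number of species and k + 1 is the body order."""
    if num_species < 1 or body_order < 2:
        raise ValueError("need at least one species and a body order of at least 2")
    k = body_order - 1
    return num_species ** k


# Compare a single-element system with multi-species systems.
for num_species in (1, 3, 5):
    counts = [species_combinations(num_species, order) for order in (2, 3, 4)]
    print(f"S={num_species}: body orders 2, 3, 4 -> {counts}")
```

With one species the count stays at 1 for every body order, while each extra body order multiplies the count by S once several species are present.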
Replying to @marceldotsci
Well deserved, happy for you
1
1
3
Replying to @andrewwhite01
Can endorse the equivariant nets part!
3
Well, that is the big open question: why does the equivariant net learn faster? Is it because of equivariance-vs-invariance, or because they use vector features + accompanying interactions, like the 1 \otimes 1 \rightarrow 0 dot product? 1/2
1
3
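For readers unfamiliar with the 1 \otimes 1 \rightarrow 0 notation in the post above: two vector (l = 1) features can be combined into a scalar (l = 0) via the dot product, and into another vector (l = 1) via the cross product. The sketch below is a plain-Python illustration, not code from NequIP; it checks numerically, for one example rotation about the z-axis and two arbitrary vectors, that the dot product is invariant and the cross product rotates along with its inputs.

```python
import math


def rotate_z(vec, angle):
    """Rotate a 3-vector about the z-axis by the given angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = vec
    return (c * x - s * y, s * x + c * y, z)


def dot(a, b):
    """1 x 1 -> 0 path: two vectors in, one scalar out."""
    return sum(p * q for p, q in zip(a, b))


def cross(a, b):
    """1 x 1 -> 1 path: two vectors in, one vector out."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


# Arbitrary example inputs (assumptions for illustration).
u, v, angle = (1.0, 2.0, 3.0), (-0.5, 0.4, 2.0), 0.7
ru, rv = rotate_z(u, angle), rotate_z(v, angle)

# Invariance: the scalar output does not change under rotation.
scalar_gap = abs(dot(ru, rv) - dot(u, v))
# Equivariance: rotating the inputs rotates the vector output the same way.
vector_gap = max(
    abs(p - q) for p, q in zip(cross(ru, rv), rotate_z(cross(u, v), angle))
)

print(f"scalar gap: {scalar_gap:.2e}, vector gap: {vector_gap:.2e}")
```

Both gaps should vanish up to floating-point error. The remaining piece of the decomposition, the l = 2 symmetric traceless part, is omitted here for brevity.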
General-purpose models are starting to become competitive with ML potentials (here a BPNN) trained *explicitly* on hundreds of structures!
1
3
1,039
As @sschoenholz says, the two are both correct. L(g) here is an arbitrary, not necessarily irreducible representation of a group element of O(3) on the space of functions on the sphere. The change of basis decomposes it into irreps.
1
3
And then when your validation loss stops improving:
2