Co-Founder @ Xaira Therapeutics. Former Baker Lab postdoc at @UWproteindesign. Interested in generative modelling for molecular design. Views my own.

I’m excited to share our significantly-updated preprint on de novo antibody design, where we now demonstrate the structurally accurate design of scFvs (in addition to VHHs) with RFdiffusion! biorxiv.org/content/10.1101/…
6
66
304
24,864
DALL-E’s amazing images are popping up all over the web. That software uses something called a diffusion model, which is trained to remove noise from static until a clear picture is formed. Turns out diffusion models can design proteins too!
32
490
2,201
Today we’re sharing a deep-learning method for protein design called RoseTTAFold Diffusion. With minimal input, it turns prompts (“create a molecule that binds X”) into new proteins that fold and function in the lab. We’ve tested 100s already. PDF here: bakerlab.org/2022/11/30/diff…
7
168
599
We’re really happy to share our preprint demonstrating the atomically-accurate design of single-domain antibodies (VHHs) with RFdiffusion!
6
76
399
45,696
We’re really excited to announce that we’re releasing code for running RFdiffusion! The code is released under an open source license and is free for anyone to use.
Today we're making RF Diffusion, our guided diffusion model for protein design with potential applications in medicine, vaccines & advanced materials, free to use. The software has proven much faster and more capable than prior protein design tools. bakerlab.org/2023/03/30/rf-d…

ALT In this animation, RF Diffusion generates a new protein (orange) that binds to the insulin receptor (blue).

9
95
354
48,206
The code is available both to download from GitHub, and also, thanks to the wonderful @sokrypton, as a Colab Notebook. github.com/RosettaCommons/RF… colab.research.google.com/gi…
6
55
222
63,102
It's so great to see our RFdiffusion paper now live @Nature. This article from @ewencallaway gives a great overview of RFdiffusion and protein diffusion models more broadly, and also highlights some of the ways people are already using it in their own research!
Digital art techniques can now devise custom, working biomolecules on demand. These proteins could form the basis for vaccines, therapeutics and biomaterials. Read the full story: nature.com/articles/d41586-0…
6
34
183
38,983
I'm excited to announce that I'm part of the team at Xaira Therapeutics! This project has been some time in the making. I'm convinced that generative modelling, and ML for biology more generally, will play a pivotal role in the next generation of therapeutics. 1/2
5
5
164
28,749
We designed binders to five medically-relevant molecules. These binder proteins pass our most stringent in silico metrics and we’re testing them in the lab right now. In the future, it might only take a few seconds to design a high-affinity binder protein for any target you want
1
13
95
RFdiffusion is best-in-class for protein backbone generation (low RMSDs to AlphaFold models) and surpasses inpainting and hallucination at scaffolding functional motifs. It makes bigger, more diverse, and more accurate proteins. (600aa protein, gray=design, colors=AF2)
1
5
76
RFdiffusion can also be guided with symmetry. For example, we have designed and are characterizing a new protein that engages all three symmetric ACE2 binding sites on the SARS-CoV-2 spike protein. In this case C3 symmetry works, but any symmetry is possible.
1
7
75
We said last week we were excited to see RFdiffusion being tested in the wet lab. Today, the paper is on @biorxivpreprint, and there's a LOT more exciting experimental data! See David's thread below 👇
We’re very happy to announce that our RFdiffusion manuscript is now on bioRxiv! A lot can change in a week - we’ve now tested over a thousand designs and there’s so much exciting new data! 🧵
1
12
76
Not to mention all the symmetric oligomers @HelenEisenach and @andrewjborst have designed and tested in the lab! Being able to design with any symmetry you want opens up so many applications for therapeutics and enzyme design.
1
7
70
Here at @UWProteinDesign, we’re discovering every day just how powerful RFdiffusion can be. It’s amazing to see technology grow so rapidly. We’re going to make RFdiffusion code available in the near future so everyone can get a chance to design their own amazing proteins!
3
5
65
.@PreethamVi and @SusanaVazTor then used RFdiffusion to design binders to two hormone peptides. These bound too tightly for our instrument to measure! Likely picomolar affinity. That’s the strongest binding to any protein, peptide, or small molecule achieved by computation alone!
2
4
64
Base RFdiffusion is capable, but with tuning and guidance *excels* at hard tasks, such as scaffolding enzyme active sites! Woody Ahern showed that doing this, AF predicts (color) that our designs (gray) fold to atomic accuracy. An exciting future for enzyme design lies ahead!
1
5
57
This work builds on the progress and insights of so many people, both @UWProteinDesign and beyond. It’s such an exciting area to be working in right now! Just today there was this news from @generate_biomed:
Today we introduced Chroma, a generative model that creates new proteins & protein complexes given geometric & functional constraints. It learns to transform unstructured, random 3D shapes into #protein molecules, which can have tens of thousands of atoms. ow.ly/Txn750LShFp
1
1
48
It’s wonderful to see this in print! This was the major part of my PhD work, developing an assay to engineer polarity in cells using de novo designed proteins. There’s so much new stuff in this since the preprint, from the amazing @Lara_K_Kruger – check it out below!
2
9
44
8,341
Very exciting to see this work out! It still feels quite amazing that you can take three functional bits of protein and bring them all together with novel scaffolds, all while retaining their native function. Congratulations to @karla_mcastro for leading this super cool project!
protein design is getting weird ... and increasingly functional! These are very weird designs! biorxiv.org/content/10.1101/… fun collaboration with @karla_mcastro @_JosephWatson @jueseph @UWproteindesign
2
46
6,640
Really excited to be giving this talk alongside @DaveJuergens tomorrow! Do join us if you're interested in hearing about RFdiffusion and how it's being used to tackle a broad range of protein design challenges!
Next Tuesday (2/14) @ 4 pm ET we'll have @_JosephWatson talk about RF Diffusion!! Sign up at ml4proteinengineering.com to receive Zoom links in your inbox and add events to your calendar!
10
44
8,125
A great day for science! Congratulations to the teams at @GoogleDeepMind @IsomorphicLabs!!
Announcing AlphaFold 3: our state-of-the-art AI model for predicting the structure and interactions of all life’s molecules. 🧬 Here’s how we built it with @IsomorphicLabs and what it means for biology. 🧵 dpmd.ai/3URDiNo
1
3
44
5,296
This project was co-led by @davejuergens and myself, alongside @naterbennett0, @brianltrippe and @json_yim. And was only possible because of the amazing contributions from @HelenEisenach @woodyahern @PreethamVi @andrewjborst @SusanaVazTor
1
30
I'm extremely happy to see the final project from my PhD published today, in collaboration with the @Oneill_Lab, @RachelSEdgar lab, other members of the @deriverylab and many others! Check out the thread below 👇
2
3
25
4,695
This is a great overview of what's happening in ML-based protein design - do check it out! 👇
Proteins are one of the most interesting applications of LLMs / #AI, but I haven't seen many overviews since it's moving so quickly. Here's my running notes on the topic and a 🧵 with some of my favorite links - lancemartin.notion.site/AI-f…
3
29
4,536
And a special shoutout to @ichaydon for all of the amazing graphics :D
4
24
Congratulations to Susana and the team! This project used an array of different methods (ML-based and not) to address the problem of binding to hormone peptides, paving the way to better diagnosis of several human diseases
So happy to have co-led this project! And so excited for this new era of protein design! It's finally out! @PreethamVi @definitelyphil @_JosephWatson @DaveJuergens biorxiv.org/content/10.1101/…
1
3
25
A huge congratulations to Susana, Phil, Preetham and the rest of the team! It was such a pleasure being involved in a project that brought together a whole range of approaches and people to make big progress on an important problem!
🎉 Excited to share that our paper "De novo design of high-affinity binders of bioactive helical peptides" is officially out! nature.com/articles/s41586-0… Hats off to coauthors @definitelyphil and @PreethamVi for making this project a reality. 🙌 Grateful for their collaboration!
1
4
22
4,089
We’ve also included a lot of examples to hopefully help get everyone up and running with the code. One notable improvement since we posted the preprint is that the model can now run with significantly fewer steps at inference time, giving (at least) a 4X speedup!
4
2
22
3,591
Congratulations to Nate and colleagues for a really amazing paper 🎉 This work laid the groundwork for so much of the subsequent methods development in binder design!
It’s exciting to see our work on improving binder design with deep learning methods published in Nature Communications! nature.com/articles/s41467-0…
1
1
18
4,961
And also to @BasileWicky @LFMilles @RobertRagotte @jueseph and all the Twitter-less collaborators who made this such an exciting and productive project!
1
15
To our knowledge, this project also details the first designed binders made to targets from protein sequence alone (i.e. no starting structure), with a Hallucination approach developed in collaboration with @JosephRogers100
1
18
Here's a thread summarising the project! nitter.app/naterbennett0/status/1… And here's the link to the paper, out today in @biorxivpreprint: biorxiv.org/content/10.1101/…
We’re excited to share our preprint where we show, for the first time, the atomically accurate design of VHH antibodies!
1
7
13
4,596
The code will be released at some point between now and the final peer-reviewed paper coming out 😊
2
1
12
1,297
This work has been a fantastic collaboration between many people at @UWProteinDesign, @Columbia and @AIHealthMIT. We’re all so excited to see what the wider scientific community can make with RFdiffusion.
10
2,355
Thanks Stephan for the great article - it really nicely describes RFdiffusion in the broader context of ML for proteins!
The past month we have seen some amazing AI news 🤖. But we should be careful not to miss out on what I believe could be one of the most disruptive proteomics technologies this decade. Protein Diffusion could usher in a new Protein Design Era. stephanheijl.com/rfdiffusion…
1
1
8
Replying to @BioExplorr
Thanks!! My guess would be it'll be made available in the next few months (we're super committed to making this available as soon as reasonably possible!). A Colab is also a fantastic idea :D
1
10
This is so awesome! I can't wait to see what else you go on to make 😄
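I tried solving the crystal structure of a de novo designed protein made with the Colab version of RFdiffusion, and it matched the prediction well. Because students can learn everything from expression and purification through to structure determination, I'm thinking of making "de novo protein design through to structure determination" the theme of next year's student practical. What a time to be alive... colab.research.google.com/gi…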
1
9
2,151
RFdiffusion was built on great collaboration. I co-led the project with @DaveJuergens @naterbennett0 @brianltrippe @json_yim @HelenEisenach and @woodyahern , at @UWproteindesign, @AIHealthMIT & @Columbia. So many others contributed and are continuing to use and develop the method
1
8
598
Thanks, as always, to David Baker, and to all the coauthors who helped make this project a reality @dejsee, Connor Weidle, @wanderingriti, Ellen Shrock, @definitelyphil, Buwei Huang, Inna Goreshnik, Russell Ault, Kenneth Carr, @SingerBenedikt, Cameron Criswell...
1
6
1,299
Replying to @chrisfrank662
Thanks! So RF scales ~quadratically with protein length, so for a 1000 amino acid protein it takes about 30s per diffusion step (and hence, for a 200-step trajectory, ~1.30h, as opposed to ~2 minutes for a 100aa protein) on an A4000 GPU. 200 steps is probably more than you need though!
7
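As a rough worked check of the timings in the reply above, the sketch below uses only the quoted figures (30 s per step at 1000 aa, ~2 minutes for 200 steps at 100 aa). The function names are hypothetical and the power-law fit is an assumption for illustration, not a benchmark of RFdiffusion.

```python
import math


def trajectory_minutes(sec_per_step, n_steps):
    """Total wall-clock minutes for a trajectory of n_steps."""
    return sec_per_step * n_steps / 60.0


def implied_exponent(len_a, sec_a, len_b, sec_b):
    """Exponent p such that time ~ length**p passes through both points."""
    return math.log(sec_b / sec_a) / math.log(len_b / len_a)


# Figures quoted in the reply (A4000 GPU); everything else is derived.
sec_per_step_1000aa = 30.0
sec_per_step_100aa = 2 * 60 / 200  # ~2 minutes spread over 200 steps

print(trajectory_minutes(sec_per_step_1000aa, 200))
print(implied_exponent(100, sec_per_step_100aa, 1000, sec_per_step_1000aa))
```

With these inputs, 200 steps at 30 s each come to 6000 s, or about 100 minutes, and the two data points imply an exponent of roughly 1.7, which is consistent with "~quadratically".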
This work *does not* solve drug design, but it shows for the first time that accurate antibody design is possible. Hopefully this work will lay the groundwork for future developments. I hope one day that making a therapeutic antibody will be as easy as pressing “go”!
1
6
661
And a huge thanks to the editors @Nature, and the thoughtful reviewers who helped shape the work presented today!
1
4
803
Thanks for the nice post! We'd be very keen to let you know how Twist is already enabling our ongoing RFdiffusion developments. Do DM me if you'd be interested in chatting!!
6
429
Replying to @proteincapsid
It should run fine, at least for small proteins. I just made this 100 amino acid protein on my local computer (no GPU) in 5.30 minutes. 300 amino acids would take around 50 minutes. It's obviously quite a lot faster with a GPU (available on Google Colab)
1
6
358
Replying to @notresz
Yes! ProteinMPNN takes the protein backbone (N-Ca-C-O) atoms and finds an amino acid sequence that would fold to that backbone structure (and it's very good at it!). RFdiffusion instead makes the protein backbone, which we then feed to ProteinMPNN :D
6
This is great fun - congrats!
1
4
217
A special mention and thanks to @JosephRogers100, with whom I worked very closely on the peptide binder Hallucination part of this project. Those binders were the first validated proteins I ever designed, and I'll never forget that "oh my word it actually works" moment!
2
671
Replying to @Jiaxing_Tan_
The hotspot input is chainletter-residue_number (rather than AAidentity-residue_number). This case seems to work fine🙂 if you provide an email or something I can send you the submission script I used. We'll also make this clearer in the README!
1
4
132
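To illustrate the hotspot format described in the reply above (chain letter plus residue number, not amino acid identity plus residue number), here is a minimal sketch. The helper names and the regular expression are hypothetical and are not part of the RFdiffusion code; the README remains the authority on the exact input syntax.

```python
import re

# One chain letter followed by an integer residue number, e.g. "A59".
_HOTSPOT = re.compile(r"^([A-Za-z])(-?\d+)$")


def hotspot_token(chain, resnum):
    """Build a hotspot token from a chain letter and a residue number."""
    return f"{chain}{resnum}"


def parse_hotspot(token):
    """Split a token into (chain, residue number); the letter is a chain ID."""
    match = _HOTSPOT.match(token)
    if match is None:
        raise ValueError(f"expected chain letter + residue number, got {token!r}")
    return match.group(1), int(match.group(2))


print(hotspot_token("A", 59))
print(parse_hotspot("K59"))
```

The second call shows why the distinction matters: under this convention "K59" means residue 59 of chain K, not a lysine at position 59.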
... Dionne Vafeados, Mariana Garcia Sanchez, Ho Min Kim, @SusanaVazTor and Sidney Chan. Thanks also to @ichaydon for making the VHH diffusion graphic! And thanks to everyone at @UWproteindesign for making projects like this possible. Great collaborations are so good for science!
3
1,216
In this work, we delineated a mechanism through which cells buffer the intracellular availability of water in response to acute changes in cellular water availability. It was a real joy to work on such a collaborative project!
2
626
It's been so exciting moving to Seattle and getting involved in this really exciting project! Thanks to everyone involved!
New preprint! We came up with 2 methods to design de novo scaffold proteins to hold arbitrary functional motifs: hallucination (optimizing a seq against predictions of RoseTTAFold (RF)) and inpainting (recover masked regions of seq+struc.) 1/n biorxiv.org/content/10.1101/…
2
I spend a lot of my time thinking about proteins, but it's important to remember that protein folding and function critically depend on the solvent (water) in which the protein is dissolved.
1
2
706
This project grew out of discussions with my friend and co-lead author @naterbennett0 last year. As a reminder, last year we published and released RFdiffusion:
Today we're making RF Diffusion, our guided diffusion model for protein design with potential applications in medicine, vaccines & advanced materials, free to use. The software has proven much faster and more capable than prior protein design tools. bakerlab.org/2023/03/30/rf-d…

ALT In this animation, RF Diffusion generates a new protein (orange) that binds to the insulin receptor (blue).

1
3
1,584
Really glad you're using the code! So to be clear, here, you're designing the cyan peptide? I think it's reasonably likely AF2 (non-multimer) wouldn't predict these well - I think it can struggle with small peptide sequences
1
3
535
These binders typically bound via rigid, secondary structure-based interactions, however, which contrasts with how nature has “solved” the binder design problem. Nature uses antibodies to bind to targets, which interact with proteins through more flexible CDR loops.
1
2
770
Success rates in the lab have been high across a range of tasks. I'm very excited to see how it works for you!
1
420
I 100% agree! The sleeper is so great, and incredibly convenient, but since the upgrade it's just way too expensive to get a bed. Shared rooms would still be much more comfortable than the seats!
2
61
So glad to have been a (tiny!) part of this project. It's so exciting seeing RoseTTAFold, and its fine-tuned variants, used for new applications!
Excited to share our work on zero-shot mutation effect prediction using RoseTTAFold! Thanks @minkbaek, @DaveJuergens and @_JosephWatson for being amazing collaborators!! Preprint here: biorxiv.org/content/10.1101/…
2
RFdiffusion excels at protein interface design (making a binder to something). We demonstrated very high experimental success rates in the paper, and across the Institute for Protein Design scientists were making binders to historically extremely challenging targets.
1
2
837
The question was though, would these antibodies actually work? @RobertRagotte and @AndrewJBorst, co-lead authors on the study, led the experimental effort. We designed VHH binders to four unrelated targets. A structure of one of them shows that it binds almost exactly as designed
1
2
669
Replying to @ozalabCP
I see these as separate problems really. If you already have an antibody/VHH, optimising it with ML/experimental methods is a very good idea. Our work designs VHHs to sites to which you don’t have one. For example, the TcdB site doesn’t have an antibody/VHH, but we made one 😊
1
1
311
Replying to @danofer
Good question! So the input is actually protein backbone 3D coordinates, which we parametrize as C-alpha (x,y,z) coordinates and a residue (backbone) orientation. The output is then also 3D backbone coordinates. We then generate a sequence with ProteinMPNN
1
120
Replying to @proteincapsid
We haven't tried this, no. It's a good idea - we can look into it.
1
35
Antibodies have many advantages as therapeutics, and are as such the biggest class of therapeutics globally. However, to date, design of structurally accurate novel antibodies has not been demonstrated. We wondered if generative models like RFdiffusion could be the answer.
1
1
694
Replying to @Eyesgack
Great question! This is possible, and is described in the README (with the `provide_seq` input)
1
1
108
Replying to @Eyesgack
We actually put this into an accompanying manuscript, which includes both some cool new RFdiffusion advances (partial diffusion and peptide binder design), as well as other methods. The binders work really well! biorxiv.org/content/10.1101/…
2
1
243
Replying to @Eyesgack
No, it should be possible to run everything without Rosetta!
2
1
341
Replying to @LeoChan213
From the outset, we wanted RFdiffusion to be computationally tractable to run, so, given that each step is quite costly, we limited to 200 steps during development. But benchmarking (to be shown in the final manuscript) then showed we can use just 50 with no performance drop!
1
153
Replying to @healthuniverse_
My bad! I'll email Dan.
1
59
Replying to @ozalabCP
Yes, for clarity, in this work we are only designing CDRs. We keep the VHH framework fixed and just design new CDR loops to interact with a user-specified epitope
1
1
97
Replying to @ozalabCP
Yep, these will be released (just working out the logistics etc)
1
70
Replying to @Robot83821931
This is a great question! It's an area that we're really interested in, and one that others (including Tommi Jaakkola, a supervisor on this project) have worked on (arxiv.org/abs/2210.01776). Currently it's not possible (at least not explicitly) to do 1/
1
1
We therefore built upon RFdiffusion to train a generative model specifically for antibody design. This model is capable of designing diverse and truly de novo antibodies.
1
1
2
730