Foundation Models for Generalizable Autonomy in Robotics. Reinforcement Learning. Faculty in AI Robotics @GeorgiaTech. Prev @nvidia @UCberkeley @stanford

San Francisco / Atlanta
Physical AI is making a classic Moneyball error by mis-pricing data. Optimizing for "cumulative operational hours" or relying on low-variance production telemetry is the robotics equivalent of chasing batting averages, when we should be isolating on-base percentage. Unlike text, every useful hour of robot data is paid for. If you aren't calculating marginal utility per dollar based on data novelty and intrinsic dimensionality, you're allocating capital based on vibes. In physical AI, useful data is strictly resource-constrained, shifting the goal from maximizing compute efficiency to maximizing marginal loss reduction per dollar. A framework breaking down: - Interventional vs. Observational channels - Why deployment telemetry behaves like a decaying oil well - The "Convergence Trap" of narrow commercial niches - Metrics to replace raw operational hours check out the full post here: open.substack.com/pub/praxis…
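A minimal sketch of the "marginal loss reduction per dollar" idea, assuming a small pilot run can measure validation loss before and after adding a batch of data. The batch names, loss values, and costs are hypothetical placeholders, not figures from the post.

```python
def marginal_value_per_dollar(loss_before, loss_after, cost_dollars):
    """Validation-loss reduction bought per dollar spent on a batch of data."""
    if cost_dollars <= 0:
        raise ValueError("cost_dollars must be positive")
    return (loss_before - loss_after) / cost_dollars


# Hypothetical pilot results: (loss before, loss after adding the batch, cost in dollars).
candidate_batches = {
    "deployment_telemetry": (0.50, 0.49, 100.0),  # cheap, low-novelty data
    "targeted_teleop": (0.50, 0.42, 400.0),       # pricier, higher-novelty data
}

# Rank data sources by loss reduction per dollar instead of by raw hours collected.
ranked = sorted(
    candidate_batches,
    key=lambda name: marginal_value_per_dollar(*candidate_batches[name]),
    reverse=True,
)
print(ranked)
```

In this made-up example the costlier batch still ranks first, because it buys more loss reduction per dollar.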
11
21
224
18,249
As @NeurIPSConf #NeurIPS2021 deadline approaches, I've consolidated some technical writing tips. Succinct answers to each of the questions, Voila, you have a crisp Introduction! Expand on prior work, intuition/tech details, & expts for the rest ensuring the why is always clear
12
196
925
We organized a seminar-style course @UofTCompSci on 3D and Geometric Deep Learning. Here is the reading list (Videos + Slides): pair.toronto.edu/csc2547-w21… Each paper comes with a 10-min tutorial: piped.video/channel/UCrsmAXn… Hope it helps folks looking to get up to speed on the topic!
9
166
735
Career update: I will be moving to @ICatGT next year. I look forward to working alongside the exceptional researchers at @GTrobotics and @mlatgt research.gatech.edu/animesh-…
54
24
725
The standards of what @Twitter thinks is machine learning have dropped somewhat. Yet for some reason 1k+ people like a thread on definition of log. ...and here I am tweeting "excited to share our latest paper on....." I prolly need education on HowTo Twitter
61
34
507
Replying to @j_foerst
I got the same take away from Dave Patterson Think of career in terms of long shots, and you only need 1 or 2 of them to succeed in your career. Systems people at berkeley have grokked it and are having a great run of success with this mantra! piped.video/watch?v=TK6EPvrm…
3
44
494
59,393
Why is credentialed gatekeeping still a thing? @dwarkesh_sp has gained more technical insight, during his deep preparations for the breadth of speakers he meets, than most people would have working full time! And yet, he almost always admits not being the expert in each of his podcasts. Experience is more valuable than pedigree!
I’m intrigued by this Dwarkesh guy but what are his qualifications? E.g. has he done research at a top school like MIT?
30
11
431
98,160
Model-Free Reinforcement Learning (MFRL) has been alluring, especially with supercharged compute with physics on GPU. However, the methods use 0-th order gradients, and are often not the best optimizers. Can we do better than PPO in continuous control for robotics? Turns out yes! 🥳 tl;dr: Faster, better RL than PPO in continuous control 💪 adaptive-horizon-actor-criti… The answer lies in using more information from the simulation. We are juicing the simulation on GPU as it is, why not use it for gradients as well? This has been a driving question in a series of our works. We first studied this problem in our ICLR 2022 paper on Short Horizon Actor Critic short-horizon-actor-critic.g… Naive gradient based methods are stuck in local minima and have exploding/vanishing gradients. SHAC solved this problem with truncated rollouts and model based value estimation, where the model is Differentiable Sim. This boosted sample efficiency and wall-clock time immensely, especially in high dimensional systems such as humanoids. Yet, given enough compute PPO often caught up. Our follow up paper on Adaptive Horizon Actor Critic at ICML 2024 discovers the cause and provides a fix. adaptive-horizon-actor-criti… However, we find that even when given ground-truth dynamics, not all gradients are useful due to sample error. 1st-Order Model-Based Reinforcement Learning methods employing differentiable simulation provide gradients with reduced variance but are susceptible to bias in scenarios involving stiff dynamics, such as physical contact. We find that back-propagating through contact and long trajectories drastically reduces gradient accuracy. Using this insight, we propose AHAC to dynamically adapt its roll-out horizon to avoid differentiating through stiff contact. AHAC is a first-order model-based RL algorithm that learns high-dimensional tasks in minutes (wall clock) and outperforms PPO by 40%, even in the limit of data provided to PPO.
This work is led by @imgeorgiev alongside @krishpopdesu, @xujie7979, @eric_heiden and ample assistance from warp team at @NVIDIARobotics (@milesmacklin)
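To make the 0-th order vs. first-order contrast concrete, here is a toy sketch. It is not PPO, SHAC, or AHAC: the quadratic objective, the function names, and the sampling parameters are all assumptions chosen for illustration. The analytic gradient stands in for what a differentiable simulator could provide, while the zeroth-order estimate only queries function values.

```python
import random


def objective(theta):
    """Toy differentiable 'return', maximized at theta = 3 (illustrative only)."""
    return -(theta - 3.0) ** 2


def first_order_grad(theta):
    """Analytic gradient, standing in for a gradient from a differentiable simulator."""
    return -2.0 * (theta - 3.0)


def zeroth_order_grad(theta, sigma=0.1, num_samples=2000, seed=0):
    """Antithetic random-perturbation estimate that uses only function evaluations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        eps = rng.gauss(0.0, 1.0)
        # Central difference along a random direction, weighted by that direction.
        diff = objective(theta + sigma * eps) - objective(theta - sigma * eps)
        total += diff / (2.0 * sigma) * eps
    return total / num_samples


exact = first_order_grad(0.0)
estimate = zeroth_order_grad(0.0)
print(exact, estimate)
```

The first-order gradient costs one evaluation and is exact here; the zeroth-order estimate needs thousands of evaluations and still carries sampling noise. On this smooth toy problem neither suffers the bias from stiff contact that the thread describes.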
8
67
396
52,292
Ever wondered if we solved RL in continuous time, would it be better for robotics. We find in many cases, Yes!
 Value Iteration in Continuous Actions, States, and Time arxiv.org/abs/2105.04682 @_mlutter @MannorShie @Jan_R_Peters Dieter Fox @icmlconf #ICML2021 @NVIDIAAI
5
53
361
Kinda definitive guide to FAQs for students evaluating Ph.D. offers! Always ask: Are there snacks in the lab? Advisors: need to invest in munchies! thx to @igilitschenski for sharing this! Original: cs.columbia.edu/wp-content/u…
6
65
328
My thoughts on the Optimus Effort at AI day @Tesla I made some notes on my takeaways. docs.google.com/document/d/e…
front row seats at AI day at Tesla remark 1 setting expectations for optimus follow along..for more
26
41
304
Nvidia Robotics Research internships for summer 2024 are now open If you are interested in learning for control, RL, and especially Foundation Models: VLM and LLMs for decision making. DM me for more and apply here nvidia.wd5.myworkdayjobs.com… BTW, these are located in both Santa Clara, CA and Seattle, WA (which is beautiful in the summer) More about the group here research.nvidia.com/labs/srl…
13
35
314
63,368
How can robots reliably place objects in diverse real-world tasks? 🤖🔍 Placement is tough—objects vary in shape and placement modes (such as stacking, hanging, and insertion), making it a challenging problem. We introduce AnyPlace, a two-stage method trained purely on synthetic data to predict diverse placement poses of unseen objects for real-world tasks. Read on for more👇
4
36
297
25,368
"Physically Embedded Planning Problems: New Challenges for Reinforcement Learning" arxiv.org/abs/2009.05524 the lack of reference to robotics folks working on this for decades and reinventing the problem as your own! gotta do better @DeepMind How "New" is this?


2
42
239
Ever wondered if we can model motion as a language? Can we tokenize this new language? Is it useful? Turns out tremendously! 🚀 In our latest #NeurIPS2024 paper on QueST: Self-Supervised Skill Abstractions for Learning Continuous Control, we find that action tokenization matters a lot! We can learn skill encodings by representing temporal action abstractions with a discrete codebook. This enables 2 things: 1. Better Behaviour Cloning: we can better assimilate multi-task data (>9% over the best prior paper). This is currently the best-in-class BC method! 2. Generalization of this language to represent new tasks in 5-shot transfer to longer horizon tasks! Check out the thread by @MeteAtharva for more details. nitter.app/MeteAtharva/stat… And check out more details at: quest-model.github.io/ Joint work with @MeteAtharva @albertwilcoxiii @Haotianxue_GT @YongxinChen1 @ICatGT @mlatgt @GTrobotics @NVIDIARobotics
Excited to share our #NeurIPS2024 work on QueST: Self-Supervised Skill Abstractions for Learning Continuous Control🦾🤖 QueST is a multitask latent-variable model that learns sharable low-level skills and outperforms Diffusion Policy, ACT and VQ-BeT by >13% in 5-shot transfer.🧵
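As a rough illustration of what "action tokenization with a discrete codebook" means, the sketch below maps short action chunks to the index of their nearest codebook vector. It is a plain nearest-neighbor lookup under assumed toy values, not the QueST architecture: the codebook entries, chunk values, and function names are hypothetical, and a real model would learn the codebook.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def tokenize(action_chunks, codebook):
    """Map each action chunk to the index of its nearest codebook vector."""
    return [
        min(range(len(codebook)), key=lambda i: squared_distance(chunk, codebook[i]))
        for chunk in action_chunks
    ]


def detokenize(tokens, codebook):
    """Recover the (quantized) action chunk for each token."""
    return [codebook[t] for t in tokens]


# Hypothetical 2-D codebook and action chunks; a learned codebook would replace these.
codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
chunks = [[0.9, 0.1], [0.1, 0.8], [0.05, -0.02]]

tokens = tokenize(chunks, codebook)
print(tokens)
print(detokenize(tokens, codebook))
```

Once actions are discrete tokens, a sequence model can predict them the same way it predicts words, which is the sense in which motion is treated "as a language" above.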
12
37
241
26,225
Isaac Gym - @NVIDIAAI physics simulation environment for reinforcement learning research (Preview Release) - End-to-End GPU accelerated - Isaac Gym tensor-based APIs for massively parallel sim buff.ly/3myqp6O Also get in touch for potential internships to flex in Gym!
7
37
230
Sad, but true! @drfeifei - "@stanfordnlp has 64 GPUs"! is that really the case @chrmanning and @percyliang ? Silver lining: the "research impact per-gpu" metric from Stanford is order of magnitude larger than tech! Public Sector needs to be given more resources for innovation. We really need AI to be decentralized like the internet!
Fei-Fei Li says Stanford's Natural Language computing lab has only 64 GPUs and academia is "falling off a cliff" relative to industry
16
20
227
131,962
It is disheartening to see American leaders exact harsh standards on Chinese humanoid players, while at the same time pushing futuristic narratives with polished choreographed videos. Ironic! The opportunity in humanoid robotics has never been greater. And the market is HUGE! We need multiple successful players for this future to come to fruition. Currently the technology is nascent and most perceived competitive leads among the players are small in the larger scheme of things. The space is still young and there is a lot to do.
Look at the reflections on this bot, then compare them to the ones behind it. The bot in front is real - everything behind it is fake If you see a head unit reflecting a bunch of ceiling lights, that’s a giveaway it’s CGI
22
7
226
35,034
Reviewers in ML continue the trend! S4RL is shot down for being too simple and obvious! arxiv.org/abs/2103.06326 All Reviewers "great work, strong empirically in many domains, but there is little fancy-pants theory!" reject🚫 they missed the memo: Surprisingly Simple!
Hot take: deep RL research has stagnated because conferences have created bad incentives, rewarding researchers for vacuous claims of novelty, tenuous-at-best theoretical connections, or SOTA, while punishing boring analysis of the empirical tricks that actually make things work.
6
15
201
Semi-supervised Learning to generate high-resolution images with disentangled latent codes using <2% labelled data Paper: buff.ly/39JS6U0 w\ @wn8_nie, T.Karras, @shoubhikdn, A.Patney, A.Patel, @AnimaAnandkumar Real win: can make your advisor happier (@drfeifei)
2
54
204
No wonder that genuine EB1 applications are backlogged because a bunch of people figured out how to game the criteria @USCIS needs to refer to publication metrics from Google scholar metrics
8
11
203
58,701
Our work on Neural Task Graphs is accepted at CVPR 2019 as Oral 😁 @drfeifei @deanh619 @danfei_xu @yukez @jcniebles @StanfordSVL arxiv.org/abs/1807.03480
4
26
200
Our new paper on Neural Task Programming: Learning to Generalize Across Hierarchical Tasks j.mp/2y8qYOS
3
65
199
Super cool new paper from the @NvidiaAI Robotics group in Closing the Sim-to-Real Loop: Adapting Simulation Randomization with Real World Experience [Paper buff.ly/2pWMLUg , video:buff.ly/2pWqXbq ]
2
75
201
Object-oriented world models is *the* key for reasoning. But, unsupervised task-agnostic methods are hard! SlotFormer, at ICLR2023, is an unsupervised video prediction model that also works for tasks: VQA and model-based planning slotformer.github.io/ Read on for more!
6
23
180
55,927
Dexterous robot hand design is arguably one of the harder roadblocks in the way of humanoids. most hands are either too big or underactuated. and the ones that have high DoF, are only kinematically useful, but have little power to even open a bottle of coke! This new hand design from DexcelRobotics looks exemplary! The world is indeed moving fast! How exciting.
A new dexterous hand is here. DexcelRobotics, a startup founded by a former core member of Tencent Robotics X, has launched its first product, the Apex Hand. The company claims it's the first in the industry capable of operating a cell phone with a single hand. The Apex Hand is a well-rounded performer, with a focus on real-world application. Here are its key specs: ► Degrees of Freedom: 21, which the founder notes is enough to replicate most human hand functions. ► Strength: A single-finger force of ~2.5kg and a vertical lifting capacity of ~30kg. ► Speed & Precision: A response time near human-level, with positioning accuracy of ≤0.1mm. ► Robustness: Can withstand unexpected impacts and maintain stability. ► Tactile: Features self-developed flexible electronic skin with a sub-millisecond communication delay and >1000Hz refresh rate. The founder explains that while academic research often focuses on single-point breakthroughs, a commercial product needs hardware stability, data handling, and model capabilities. The company’s full-stack development experience and modular design are key to bringing this technology to a broader market, starting with semi-structured industrial environments before reaching homes.
5
26
173
19,581
We are co-organizing #NeurIPS2021 workshop on Differential Equations and PDEs. Promises to be a very exciting agenda on a topic increasingly getting popular among ML folks. Sept 17 deadline for contributed papers stay tuned for more!
Our 'The Symbiosis of Deep Learning and Differential Equations' workshop has been accepted for #NeurIPS2021 ! Send us your work on data-driven dynamical systems, neural differential equations, solving PDEs with deep learning etc. Tentative submission deadline Sept. 17.
1
26
167
We just won the Best student paper at #RSS2021 @RoboticsSciSys 🥳🎉🍾 Check it out here: diff-cutting-sim.github.io Video: piped.video/bN4yqHhfAfQ Time to cut some cake!


Very honored that DiSECt won a Best Student Paper Award at #RSS2021! Congrats to my co-authors @milesmacklin, Yashraj Narang, Dieter Fox, @animesh_garg and Fabio Ramos, and thanks for this great collaboration @NVIDIAAI! 🎉
10
10
162
Deep RL is not really using deep networks! We found that dense connections and deeper networks help improve learning performance. Results hold across various manipulation and locomotion tasks, for both proprioceptive and image observations! sites.google.com/view/d2rl/h…
Deep learning has seen huge gains when you increase the number of layers, but what about Deep RL? Introducing D2RL! Changing how you parameterize your policy + Q function boosts performance Co-led with @mangahomanga. @AravSrinivas @animesh_garg Link: arxiv.org/abs/2010.09163
1
30
156
Sometimes you gotta give it to TAs. The guy below has a plan for doing a discussion section in case of a nuclear attack, COVID19 seems less harrowing! "No matter what happens, I will not give up on you" 🤯
Sounds like Berkeley instructors are handling #COVID19 well 😂
4
10
155
32 for a single conference! And 13 accepts 😱 @svlevine you are making everyone else look like they are not even working
#ICLR2020 authors with > 5 submissions: 32 Sergey Levine 20 Yoshua Bengio 16 Cho-jui Hsieh 14 Pieter Abbeel 13 Liwei Wang, Tom Goldstein, Chelsea Finn, Bo Li, Jun Zhu # of accepted papers: 13 Sergey Levine 7 Le Song, Jun Zhu 6 Cho-jui Hsieh, Jimmy Ba, Liwei Wang, Pushmeet Kohli
3
17
157
This needs to be said out loud, particularly by gatekeepers of AI/ML academia (Profs, advisors, mentors, reviewers, SACs, ACs) Novelty for the sake of it is not a virtue! Many students trying to be different for the sake of novelty and often sacrificing utility in the process!
I still remember I marveled at how simple and elegant the Lucas-Kanade tracker is when I first learned about it as a grad student. Here is a fun story about it. "Newness itself is not a virtue, usefulness is." - Takeo Kanade
5
21
151
The countdown for @iclr_conf starts today! Timely with @NeurIPSConf rebuttals underway. The PC @iclr_conf this year is introducing a new Reciprocal Reviewing Requirement - All authors (individual on >=3 papers) must individually provide service as a reviewer for >=6 papers (or help with organization/SAC/AC) - all submissions must have at least one author who is registered to review >= 3 papers It is a fair way to enable reviewing at scale without penalizing for high productivity or explicitly putting upper bounds on submissions. Also no more gymnastics with latex! ..main text must be between 6 and 10 pages (inclusive) with suggested length at <=9 Authors should write the papers they need to write - including shorter ones, no need to fill the page limits. It is great to see innovative initiatives to explore scalable models to maintain quality of reviews as the ML community grows. SPC: @cvondrick + PC (@feishaAI @yuqirose @VioletNPeng @animesh_garg)
Announcing the ICLR 2025 Call for Papers! Abstract submission: 11:59pm, Sept 27 (AoE) Submission date: 11:59pm, Oct 1 (AoE) Reviews released: Nov 12 Author/Reviewer Discussion: Nov 12-26 Author Last Day to Reply: Nov 27 Final Decisions: Jan 22 2025 iclr.cc/Conferences/2025/Cal…
9
20
155
50,964
Interested in how "Deep Learning meets Differential equations"? Join us for @iclr_conf workshop on Apr 26. buff.ly/2JUuGjJ Streaming and discussion platform details to be posted on workshop website. @AnimaAnandkumar @TanNguyen689
2
27
150
Every once in a while authors are bold enough to break the mold and surprise reviewers. a 4 page paper on the effectiveness of patches in deep learning! tl;dr: we had an idea and it works damn well! openreview.net/forum?id=TVHS… grabbing🍿to see reviews!!
openreview.net/forum?id=TVHS… A very ... interesting 4 page paper at ICLR. I'm curious to see the reviewers' reactions.
1
17
138
Exciting week in robot manipulation! After the Berkshire Grey IPO, Google deems that its effort is ready to move out of X. Intrinsic, a new alphabet entity to focus on industrial robotics. Rather exciting time to be working in robot manipulation. 🦾 blog.x.company/introducing-i…
2
13
130
Learning Causal Graphs that capture Physical Systems has high potential yet is challenging! Check out End-to-End Causal Discovery from videos Site: yunzhuli.github.io/V-CDN/ Paper: arxiv.org/abs/2007.00631 w\ @YunzhuLiYZ @AnimaAnandkumar, A.Torralba, D. Fox

ALT Discovery of a Causal Graph from a Single Video Instance of a Deformable Object (a Shirt in this case)

2
32
133
NeurIPS 2019 Paper IDs are already over 8900! 😱
5
23
132
Robot learning from BC based models has really come a long way, and continues to surprise me. The complexity of real world tasks achievable with BC based methods on real data sidesteps the challenges in RL, particularly the sim2real aspects! The new release of ALOHA Unleashed from Google DM videos demonstrates multiple complex tasks, all autonomous with reasonable speed, and most importantly realistic appearing trial and retrial behavior -- aka closed loop policies. It is indeed exciting to think how far this paradigm can be pushed, and for all its criticisms (mind you I am one of its critics), it will definitely push the envelope or the t-shirt in this case!
2
18
131
27,712
This summer, I've picked up my long-form writing again on my blog at praxiscurrents.substack.com First up is a perspective on foundation model in robotics. I call it the age of empiricism in Physical AI buff.ly/XbZNzes Feel free to visit and subscribe for regular updates.
4
11
130
9,168
Perks of working at @NvidiaAI ;) arguably the most kick-ass simulator is in-house and now for everyone else as well. Going to be at #NeurIPS2018 in case you are interested in chatting about opportunities in ML, Vision, and Robotics.
At #NeurIPS2018 we announced that PhysX, the world’s most popular physics simulation engine, is now open source. Robotics researchers can easily train machines in realistic environments. nvda.ws/2BK92vk
4
24
123
Value Gradient weighted Model-Based Reinforcement Learning at #ICLR2022 Blog: pair.toronto.edu/blog/2022/v… Paper: arxiv.org/abs/2204.01464 📢 Poster: Wed Apr 27, 10:30 am PST iclr.cc/Conferences/2022/Sch… @c_voelcker V. Liao @SoloGen @UofTCompSci @VectorInst for more 👇
3
20
124
CNNs are biased towards high-frequency textural information. New work on fixing CNN's over-reliance on texture through a curriculum that exposes texture slowly. Results in better features that generalize both to new datasets and to new tasks. Paper: arxiv.org/abs/2003.01367
If you’re looking for a paper for your Friday readings, maybe check out our new work! Joint work with @hugo_larochelle & @animesh_garg! If you want to improve your CNN but don’t want to add trainable parameters or add any regularization loss, give our paper a try! 🙂
1
22
123
A leading car company makes a humanoid robot that appears to be ahead of its times, and hopefully rallies many researchers to work on the problem. This is ground breaking ...only that this tweet is a few years too late spectrum.ieee.org/honda-asim…
10
7
114
The Path to Autonomy is paved with Simulation! The focus on building the best data engines for Embodied AI & Robotics has been a core mission at @NVIDIARobotics I am so excited to share that the simulation effort we have been working on for the last 2+ years will be part of the bigger scheme of Omniverse & Robotics efforts @nvidia 🎉🪩Isaac Orbit is now Isaac Lab! The ease of use of the current framework powers a lot of training and data-generation pipeline for the Project GR00T. If you have worked with Orbit, you will see a lot of familiar visuals from Jensen's keynote and more! We created Orbit with the mission to democratize robot learning and enabling developers through a batteries-included framework to jumpstart development. The mission continues with Isaac Lab to build a lighter, faster and intuitive robotics framework built on Isaac Sim.
6
9
115
22,517
I met Jensen in my first week at Stanford, and admittedly, did not realize the long game he was driving Nvidia towards! Indeed the world has come a long way since but Jensen maintains his conviction to drive AI at @nvidia and broadly.
@nvidia CEO JensenHuang delivered the world's 1st AI supercomputer DGX-1 today to SAIL! @jcniebles @silviocinguetta
2
4
119
20,955
Unitree G1 Humanoid Price from $16K G1 has 23-43 joints and offers extra large joint angle limits Notably, it has really usable 3-fingered hands in addition to anthropomorphic hands with Force control (in some variants). The demo is impressive in so many aspects as it is, but points of note 1. the robot is smaller than a human 2. light enough that a human could lift it! and even folds into a small volume! 3. hands doing both high force and high precision tasks. Humanoids at this price point speed up the development from academic labs, as well as add a sense of urgency among other Humanoid companies to improve the pricing of their hardware platforms. This is so exciting! Source: Jason Wong (@UnitreeRobot007) @UnitreeRobotics
7
17
118
42,708
This is so great to see exceptional researchers deciding to switch into robotics! This is a sign of times to come - when talented people start to find a problem interesting, the field accelerates like crazy!
Booting up my lab’s new crew. 🤖 I’m betting big on robotics and a foundation model for generalist policy. Why? With the blooming of foundation models, useful, reliable physical intelligence is within reach – one policy composing skills across robots and homes, with safety guarantees and predictable cost. The unlock isn’t a trick, it’s a stack: • Data pyramid & governance – teaches long–horizon behavior; respects privacy and licensing. • Models & algorithms – unify perception, planning, control with test–time alignment and risk–aware planning. • RoboEval – standardized, scalable evaluation that rewards reliability under shift, not cherry–picked demos. • Systems – sim→real pipelines and policy→robot deployment at scale. 🛠️ UMD is positioned to set that stack – and the rules – in the open, with standards for data, safety, and evaluation that everyone can build on. ✨ Can you imagine a world where your household robot does your laundry, cooks your meals, cleans your home, and bartends your house party? 🧺🍸 #Robotics #EmbodiedAI #PhysicalIntelligence #UMD #OpenStandards
1
3
114
23,116
What does the video remind you of the most? oh! I know... (deep) Reinforcement Learning papers proposing new algorithms #SOTA
4
6
100
Only a matter of time when we had legs with wheels on biped systems! Also dual battery system for self-swaps And it is not just CGI, they have built it! Well done @HexagonAB Pretty remarkable how fast the hardware is iterating in this space! Links in 🧵
5
16
105
7,971
Proud moment for members (and alums) of @StanfordSVL upon the election of @drfeifei to National Academy of Engineering Congrats Fei-Fei! nae.edu/19579/31222/20095/22…
12
104
front row seats at AI day at Tesla remark 1 setting expectations for optimus follow along..for more
5
9
99
GPU-poor no more! Exciting to see support for open source and academic entities with compute. @natolambert @HannaHajishirzi @RanjayKrishna
With fresh support of $75M from @NSF and $77M from @NVIDIA, we’re set to scale our open model ecosystem, bolster the infrastructure behind it, and fast‑track reproducible AI research to unlock the next wave of scientific discovery. 💡
4
3
105
11,726
Got the joke too late Don’t follow the MIT guy as much
1
104
6,086
Teaching is hard, they said! have to prep exams, they said! then this....
4
1
100
I am beyond excited to be building the PAIR group at @ICatGT working on ML and Robotics. @mlatgt & @GTrobotics is a nexus that will help us create the next wave in AI Robotics. Thanks Nathan Deen for the kind praise.
The next 10 to 12 years will see a boom for advancement in robot manipulation, says @animesh_garg. The new assistant professor has chosen @GeorgiaTech as the place he wants to be during this crucial time. @GTrobotics @mlatgt b.gatech.edu/3szyPmI
2
10
102
17,490
I talked about Structure in Reinforcement Learning for Robotics at #ICRA2022 workshop on Behavior Priors tl;dr: Structured Biases in DL improve both efficiency & generalization. Robot Learning / RL needs new ones! Slides: animesh.garg.tech/assets/pdf… Video: piped.video/watch?v=5u2cGaxF…
1
14
94
I am increasingly excited that robotics, and manipulation in particular, is enjoying the attention of the broader AI community. @JitendraMalikCV said this "Robotics is far too important to be left to roboticists" @corl_conf has increasingly become the intellectual home to folks not only from robotics but also from vision, machine learning and also language. Ideas coming from all perspectives are both useful, and often unblock us because an outsider doesn't always start with the intellectual baggage that an old-timer might. We are in for progress at Light Speed in manipulation. All aboard! 🚀🤖🦾🧤
Lots of memorable quotes from @JitendraMalikCV at CoRL, the most significant one of course is: “I believe that Physical Intelligence is essential to AI” :) I did warn you Jitendra that out of context quotes are fair game. Some liberties taken wrt capitalization.
2
3
97
14,387
Exciting new work on Neural Task Graphs: Generalizing to Unseen Tasks from a Single Video Demonstration buff.ly/2N22hYv @deanh619 @YukeZhu @danfei_xu @silviocinguetta @drfeifei
2
20
99
Our new work at #Corl2019 will present RL with Ensemble of Suboptimal Teachers -aka- specify as much as you can easily, let learning handle the rest. Blog: buff.ly/2O99xWr Paper: buff.ly/2NY9KeU w\ @andrey_kurenkov, A. Mandlekar, @RobobertoMM, @silviocinguetta
1
26
96
Official announcement of the @UofTCompSci Toronto robotics research cluster at @UTM and part of @UofTRobotics. @BurgnerKahrs (Continuum Robots), @florian_shkurti (Mobility), @animesh_garg (Manipulation) Hiring students, postdocs, engineers - reach out! buff.ly/2KNNjWy
5
18
93
Thesis defense of my mentee, and then close friend @AjayMandlekar Take home: Robot intelligence needs supervision at scale. And it's not just an algorithms problem, equally a systems problem! ⭐️-committee: @drfeifei @silviocinguetta @EmmaBrunskill @chelseabfinn @DorsaSadigh
2
13
95
We have been running an AI in Robotics (AIR) Reading Group. PIs have so many talks, while students are in background. AIR is a platform for students to network across their immediate network and get speaking experience. For the students, by the students pair.toronto.edu/robotics-rg…
2
19
92
please can we have this in Vision (@CVPRConf , @ICCV_2021 ), ML (@NeurIPSConf , @iclr_conf, @icmlconf), & Robotics (@corl_conf @RSS_Foundation) Ideas where to start the petition, @gneubig
ARR is now accepting submissions! Please see aclrollingreview.org/authors for an overview of the submission form and link to the submission site. Submit by 5/15 to be eligible for @emnlpmeeting ! #NLProc
3
11
88
We have been working on conditional video generation for a while and even have a paper in the upcoming CVPR however the results in this paper are just amazing! Now I can go revive my dream of being a TikTok celeb!
CogVideo: Large-scale Pretraining for Text-to-Video Generation via Transformers github: github.com/THUDM/CogVideo
1
5
86
I am very excited about the potential of high perf simulation in enabling AI Robotics. Looking forward to providing an overview of things we have been developing in the last couple of years, and a preview of things to come! Come check it out: Nov 8 2:30pm EST.
Could simulation frameworks provide faster more accurate learning for industrial robots? Find out from Prof Animesh Garg on Nov 8. Full schedule & free registration eventbrite.ca/e/2021-retail-… #Robotics #AI #ML @animesh_garg @angelaschoellig @domo_mr_roboto
1
12
87
Need a helping hand in the lab? Tired of manually doing tedious chemistry experiments over and over? Meet ORGANA 🤖 🧪 – a modular and user-friendly robotic lab assistant that can interact, plan and automate a number of chemistry experiments!
3
16
84
15,080
This work is accepted at @NeurIPSConf #NeurIPS2020 Come talk to us: Wed Dec 09 09:00 AM -- 11:00 AM (PST) @ Poster Session 3 #874 Paper and Talk at nips.cc/virtual/2020/public/… Code is now out as well github.com/pairlab/v-cdn
Learning Causal Graphs that capture Physical Systems has high potential yet is challenging! Check out End-to-End Causal Discovery from videos Site: yunzhuli.github.io/V-CDN/ Paper: arxiv.org/abs/2007.00631 w\ @YunzhuLiYZ @AnimaAnandkumar, A.Torralba, D. Fox

ALT Discovery of a Causal Graph from a Single Video Instance of a Deformable Object (a Shirt in this case)

2
16
84
Imitation learning frameworks often operate on 2D inputs, but 2D limits generalization even to camera poses. This has been an ongoing challenge, especially for humanoids since the camera pose is not steady and need not match the training data. @albertwilcoxiii has been working on the idea of operating with canonical 3D representations instead. However, doing this naively doesn't always work since most of the data is not easily featurized in 3D. Our effort in Adapt3R proposes a solution which generalizes to wildly varying camera poses and even to new robot embodiments.
3
8
84
6,341
Call for Papers for #Neurips2022 is out! 📅 May 19, 2022 1PM PDT @SGhalebikesabi & I will be Comms Chairs managing @NeurIPSConf among other things! This is going to be a fun year! neurips.cc/Conferences/2022/…


3
9
83
Are you interested in emerging questions in perception for manipulation? Consider our workshop on "Visual Learning and Reasoning for Robotic Manipulation" at #RSS2020 @RoboticsSciSys. Work-in-progress ideas equally welcome as are polished ones! buff.ly/3c69djC
11
83
This is a moment to celebrate for the long standing foundational contributions to RL by Sutton and Barto Congrats to them and all their collaborators over the years who made this long line of advancements possible
Meet the recipients of the 2024 ACM A.M. Turing Award, Andrew G. Barto and Richard S. Sutton! They are recognized for developing the conceptual and algorithmic foundations of reinforcement learning. Please join us in congratulating the two recipients! bit.ly/4hpdsbD
Accelerated Policy Learning with Parallel Differentiable Simulation at #ICLR2022 Project: short-horizon-actor-critic.g… Paper: arxiv.org/abs/2204.07137 Poster: Tue Apr 26 6:30 PM PDT iclr.cc/Conferences/2022/Sch… read more 🧵👇
Interested in Exploration in RL with image based inputs? Self-supervised goal reaching requires committed exploration for long-horizon tasks. LEAF: Latent Exploration Along the Frontier with reachability estimation Paper: buff.ly/31FqVbI @florian_shkurti @therealhomanga
Grasping with multi-finger hands is so hard! Sadly, sampling-based data scaling doesn't work 😭 Our @eccvconf paper presents Grasp'D! Diff Sim for exploring in full grasp space with surface contacts. It works for both robot and human hands without imitation data Thread 👇
Grasp'D: Differentiable Contact-rich Grasp Synthesis for Multi-fingered Hands at #ECCV2022 Project: graspd-eccv22.github.io/ Paper: arxiv.org/abs/2208.12250 Video: piped.video/watch?v=tkl_lywZ… [Thread below]
Must feel like end of an era at FAIR with Ross (@inkynumbers) going to AI2 and Kaiming going to MIT Both Ross and Kaiming have taught the community the need for good systems and simplicity in research, especially at scale.
Incredibly excited to announce that Ross Girshick (@inkynumbers) will be joining the PRIOR team @allen_ai ! Ross is one of the most influential and impactful researchers in AI. I'm so honored that he is joining us, and I'm really looking forward to working with him.
Long-term reasoning needs an understanding of continuous changes in the world. "What happens if I open the door?" Action Concept Grounding Networks learn these semantics arxiv.org/abs/2011.11201 iclr-acgn.github.io/ACGN/ @GnosisYu W.Chen @SMEasterbrook @UofTRobotics @VectorInst
Is data going to solve robotics? I was asked to ponder this at the ICRA 2025 Keynote this year alongside Daniela Rus, Russ Tedrake, Aude Billard, Leslie Kaelbling and Frank Park. Turns out that this is a surprisingly polarizing question, especially in robotics. As I built my arguments for the debate, I refined the ideas into 4 key lessons.
1. Too much structure hurts! We have seen time and again that engineered inductive biases may not be competitive as data & compute scale. The argument for building a complex solution from a model-first approach suffers from that same challenge of picking the correct hypothesis space!
2. Data helps with Ambiguity & Robustness. The real world is far too ambiguous to specify in terms of a clean objective! Moreover, achieving generalization in terms of robustness to variability, even within a task family, has long been a challenge for model-based methods.
3. Data leads to a Unifying Perspective! Robotics has been a community of communities! Despite the nuances, we are still excited by the same problem: enabling intelligent behavior in a physical environment. Yet the robotics community has deep fissures across subareas. Soft inductive biases as well as newer AI perspectives have united many communities across the Sciences. Roboticists can turn over a new leaf by adopting learning-based tools to unite the variety of problems and representations across the domain.
4. Data-First ⇏ Lack of Modularity. Adopting a perspective that scales Data/Compute does not eschew modular abstractions. The fabric of computing has been built on abstractions. Robotics is the next generation of computing. There is no reason to believe that the abstractions will be discarded altogether, yet a new paradigm is needed as each layer is now guided by a learning-based technique.
Scaling Data appears to be necessary, but insufficient to "solve" robotics problems! Yet data will take us very far in our quest for intelligence, and then we will turn into scientists who "study" our creation to understand and extend it. Slides: animesh.garg.tech/assets/pdf…
It is mind-blowing that this patent application is even possible and passed internal checks at @GoogleAI. This would be neither truly novel nor unique. And yet, just because someone else did not patent this, someone tries to patent not an idea but an entire field 🤣 🤦🏻‍♂️
Google files patent “Deep Reinforcement Learning for Robotic Manipulation” patentimages.storage.googlea…
great work from @leto__jean & team. Data is very expensive and imitating from humans directly is a great idea. We attempted a similar strategy for inpainting in the LBW-KP paper pair.toronto.edu/lbw-kp/ Of course the capability of diffusion based models has improved much since, and creating crisper outcomes is easier including control-net style background replacements. It is definitely worthwhile to push it to the limit of current generative models.
Introducing Phantom 👻: a method to train robot policies without collecting any robot data — using only human video demonstrations. Phantom turns human videos into "robot" demonstrations, making it significantly easier to scale up and diversify robotics data. 🧵1/9
Learning Keypoints for robust state representations. "Unsupervised Disentanglement of Pose, Appearance, and Background from Images and Videos" Paper: arxiv.org/abs/2001.09518 Code: github.com/NVIDIA/Unsupervis… @aysegl_dndr, Kevin Shih, @abstract_ai, @drewtao, @ctnzr @NvidiaAI
Great research, great food, and you get to stay after graduation! most tech jobs that one may want exist in Toronto. (3rd after SF and NYC) students shopping for grad school and postdocs should take note 😉 @CadeMetz does a profile in @nytimes nytimes.com/2022/03/21/techn…
We are hiring new students in robotics in various areas at the newly formed robotics cluster. @florian_shkurti @BurgnerKahrs @igilitschenski @UofTRobotics @UofTCompSci And btw we have a farm of (real) robots (>20), in case you are looking to play during grad school
Passionate about Math 🧮, CS 💻, AI 🧠, and Robots 🦾? This season, I am looking for my first cohort of graduate students 🎓 at the fantastic @UofTCompSci 🇨🇦 ❤️💻. Find out more at web.cs.toronto.edu/graduate/… and apply by Dec 1st.
Isaac Gym blog @NVIDIAAI news.developer.nvidia.com/in… Using just one A100 GPU, Isaac Gym achieves same perf. in ~10 hours — as compared to 30 hours on 6000+ CPUs. A single GPU outperforming an entire cluster by a factor of 3x Reel of current envs in IG: piped.video/lyaoewyXAxM
Isaac Gym - @NVIDIAAI physics simulation environment for reinforcement learning research (Preview Release) - End-to-End GPU accelerated - Isaac Gym tensor-based APIs for massively parallel sim buff.ly/3myqp6O Also get in touch for potential internships to flex in Gym!
Our new paper on a hierarchical framework that combines model-based control and RL for learning robust quadruped controllers Paper: arxiv.org/abs/2009.10019 Video: piped.video/watch?v=JJOmFZKp… X. Da @zhaomingxie @HoellerDavid B. Boots @AnimaAnandkumar @yukez @BuckBabich at @NVIDIAAI
Researchers at NVIDIA have built a robot that automatically adapts to different terrains helping delivery robots and other autonomous machines function more effectively in environments. See this technology in action. #GTC20
ha! i was (and still am) advised by @drfeifei could not have asked for a better postdoc experience.
Using LLMs for code generation allows a very intuitive and effective way to perform feedback guided multistage planning for robotics. Come chat with me and @Ishika_S_ about ProgPrompt on Thurs Jun 1st - Poster Hall, 3-4.40pm, Pod 10 at #ICRA2023 progprompt.github.io/
Super excited to share our work ProgPrompt! We show how LLMs can be used for situated robot planning by prompting them with pythonic code. abs: arxiv.org/abs/2209.11302 project page: progprompt.github.io [1/9]
Many schools have dropped (or made optional) the GRE requirement -- a step in the right direction. btw GRE is optional at @UofTCompSci as well! It is an undue burden for non-native speakers to memorize words far too recondite for prosaic usage!


Replying to @paul_pearce
UC Berkeley MIT Stanford CMU UIUC U. of Washington Cornell Georgia Tech Princeton UT Austin Michigan Wisconsin UCSD Harvard UMD UPenn Purdue UMass Amherst NYU NEU UChicago And I hear more are coming. I'll add more if people reply.
Interested in Robotics Engineering in research? We are hiring an Engineering Tech in Robotics to help with robotics education content + open-sourcing robotics frameworks + robot learning algorithms! jobs.utoronto.ca/job/Mississ… DM for info & retweet🙏 @UofTCompSci @UofTRobotics @UTM
Simulation is the data factory for robotics. Yet, we seem to only use it for scale! Scale is not all you need! Or at least it is not the only ingredient. Algorithmic innovation matters🛠️ So what is beyond vectorized physics? I provide a perspective on using additional information from your simulator in a talk at @nvidia #GTC24 today on Embracing contacts: Learning Control with Differentiable Simulation.
Over the course of the last 2 years, in collaboration with @NVIDIARobotics, we have found that adding first-order information enables problems which are otherwise not easily solvable by high-dimensional sampling (such as grasp synthesis for dexterous hands) and exploration (such as PPO for functional loco-manipulation). The Humanoids of the future will need to use hands, but their policies are surprisingly hard to learn. We believe using first-order methods in RL enables an elegant solution to the optimization problem without the need for human-engineered guidance. These are essential building blocks we need towards the cross-functional moonshot efforts to build a Foundation Model for Humanoids, both at @NVIDIAAI (Project GR00T) and elsewhere.
We have a body of work in this space that I discuss briefly in this talk.
1. Grasp'D: Grasp Synthesis with DiffSim dexgrasp.github.io/
2. Fast-Grasp'D & Grasp'D-1M: XPBD solver and large-scale dexterous grasping dataset. fast-graspd.github.io/
3. HandyPriors: Differentiable Priors for physics-aware agent-object interaction sites.google.com/view/handyp…
4. SHAC: Short-Horizon Actor-Critic. RL with DiffSim that scales and matches PPO short-horizon-actor-critic.g…
5. AHAC: Adaptive Horizon Actor-Critic. Improves SHAC with adaptive automatic truncation to mitigate gradient quality issues in simulation. This is now both faster and asymptotically better than PPO, even if PPO is given 100x more data!🚀 adaptive-horizon-actor-criti…
Slides from the talk today dropbox.com/scl/fi/2jgxkc9ch… Check it out and stay tuned for the video!
Session details: nvidia.com/gtc/session-catal…
I am here at #GTC2025 this week. I will be speaking at Driving AI-Powered Industry Automation at 4pm Tue 3/18. Come chat with me about Robotics, Simulation and AI. DM me for meet up.
I'm giving a talk today at #ICLR2025! - Representations for Embodied FMs (Robot Learning Workshop @ 4:45 pm) robot-learning.ml/2025/ Will talk about - anyplace (any-place.github.io/) - adapt3r (pair.toronto.edu/Adapt3R/) - PWM (imgeorgiev.com/pwm/) among others. There is also an interesting panel on conceptions of generalization and benchmarking in robot manipulation.
Very timely observation. I have been arguing for 3+ years that robot learning is inherently a systems challenge. This is not to argue that robotics doesn't benefit from algorithmic advances, but the marginal gain from systems is way higher. And this has only magnified as we see increased non-academic activity in the space.
A large part of Robot Learning is systems research, but it doesn't quite fit the existing paradigms: think of CoRL, RSS, SOSP, and MLSys. It draws from all but is inherently full-stack --- hardware, control, compute, ml algos. There are nice works here and there but we likely need something more structured to gather and distribute principles & ideas.
The path to autonomy is paved with simulation. Today, we present ORBIT-Surgical to move the needle in the right direction for surgical robotics ⚕️🏥🥼 Come talk to us: Thu 2:30 pm JST in NT G-301 at #ICRA2024
I have worked in Surgical Subtask automation for 10+ years, and despite the utility, progress in this niche of robot learning has been slower than in other domains. The lack of easy-to-use, high-fidelity simulation has been a continued handicap in bringing new ideas in robot learning into medical robotics.
ORBIT-Surgical is an Open-Simulation Framework for Learning Surgical Augmented Dexterity. It provides a unified simulation for multiple hardware platforms (dVRK, STAR) and various surgically relevant tasks (14 envs). It enables research in both perception and decision making (RL as well as imitation). Importantly, it is easy to build on and extend!
👉 GPU-Optimized Reinforcement Learning: Unleash the potential with interfaces that accelerate the training of reinforcement learning policies on a single Nvidia GPU. Witness the swift mastery of tasks involving both rigid and soft objects, achieving results in mere hours!
👉 Teleoperation Excellence: Immerse yourself in a myriad of interfaces, featuring the dVRK manipulator, and seamlessly teleoperate the digital twin in real time, hence leveraging real-world demonstrations for policy learning in simulation.
👉 Photorealistic Synthetic Data: Experience diverse sensor modalities in simulation, from high-fidelity synthetic RGBD images to semantic segmentation. The fusion of photorealistic synthetic and real-world images supercharges surgical tool segmentation models.
You can now read the paper and play with the code today: orbit-surgical.github.io/ This is open source 💪, and we welcome contributions from the community 🫂 @nvidia writes a great blog summarizing this work blogs.nvidia.com/blog/orbit-… This is a great collaboration across institutions @NVIDIARobotics @ICatGT @berkeley_ai @UofTCompSci @leggedrobotics @MasoudMoghani Qinxi Yu @KDharmarajan123, Vincent Schorp, William Panitch, @JasonJZLiu Kush Hari, Raven Huang, @mayankm155 @Ken_Goldberg
Looking forward to interacting with friends from MILA, and talking about recent work in my series on Generalizable Autonomy. Part I was last year at the MIT Deep Learning Class. piped.video/8Kn4Gi8iSYQ Hopefully this will be a worthy Part II: same old problem, all new methods!
The second speaker in the robot learning seminar series is @animesh_garg from @UofTRobotics Tune in at 4 pm ET tomorrow (5 Feb) piped.video/watch?v=HLYenPrC… Seminar series schedule: montrealrobotics.ca/robotlea… nitter.app/MontrealRobots/status/… @Mila_Quebec
Come talk to us about SlotDiffusion - an object-centric Latent Diffusion Model (LDM) designed for both image and video data 🗓️ Wed, Dec 13, 10:45 📌 Poster #611 Hall B1+B2 (level 1) The students couldn't be here but the advisors (@igilitschenski) are equally fun to chat with!
A fresh look at low-cost mobile manipulation which simplifies the mechanism yet seems to achieve a broad set of tasks. Ex-Googler's Startup Comes Out of Stealth With Beautifully Simple, Clever Robot Design. spectrum.ieee.org/automaton/…