Associate Professor @UTCompSci | Director @NVIDIAAI Co-Leading GEAR | CS PhD @Stanford | Building generalist robot autonomy in the wild | Opinions are my own

Austin, TX
Got a taste of @Tesla's FSD v12.3.4 last night. By no means flawless, but the human-like driving maneuvers (with no interventions) delivered a magical experience. Excited to witness the recipe of scaling law and data flywheel for full autonomy show signs of life in real products. The future of end-to-end robot learning is bright. Big thanks to @charles_rqi for the thrilling test rides!
26
186
1,746
467,874
The game of tenure-track faculty job: ℍ𝕒𝕣𝕕 𝕞𝕠𝕕𝕖: 1st year ℍ𝕖𝕝𝕝 𝕞𝕠𝕕𝕖: 1st year + COVID-19 𝕀𝕟𝕗𝕖𝕣𝕟𝕠 𝕞𝕠𝕕𝕖: 1st year + COVID-19 + No Power/Internet in freezing Texas P.S. It has been great fun to play. What's next?
24
20
913
Proud to see our latest progress on Project GR00T featured in Jensen's #SIGGRAPH2024 keynote talk today! We integrated our RoboCasa and MimicGen works into NVIDIA Omniverse and Isaac, enabling model training across the Data Pyramid from real-robot data to large-scale simulations.
13
55
397
47,215
People who are really serious about robot learning should make their own robot hardware.
24
40
597
86,074
The million-dollar question in humanoid robotics is: Can humanoids tap into Internet-scale training data such as online videos due to their human-like physique? Our #CoRL2024 oral paper showed the promise of humanoids learning new skills from single video demonstrations. (1/n)
13
105
553
69,580
My Robot Learning class @UTCompSci is updated with the latest advances and trends, such as implicit representations, attention architectures, offline RL, human-in-the-loop, and synthetic data for AI. All materials will be public. Enjoy! #RobotLearning cs.utexas.edu/~yukez/cs391r_…
13
102
528
New work: we built a meta-learning algorithm for an agent to discover the cause and effect relations from its visual observations and to use such causal knowledge to perform goal-directed tasks. Paper: arxiv.org/abs/1910.01751 Joint work w/ @SurajNair_1 @drfeifei @silviocinguetta
5
104
446
Excited to announce RoboCasa, a large-scale simulation framework of everyday tasks! We use generative AI tools to create diverse objects, scenes, and tasks. Simulation plays a pivotal role in our Data Pyramid for training generalist robots. Open-source at robocasa.ai
14
91
450
141,502
📢Update announced in today’s #GTC2024 Keynote📢 We are working on Project GR00T, a general-purpose foundation model for humanoid robots. GR00T will enable the robots to follow natural language instructions and learn new skills from human videos and demonstrations. Generalist robots need a versatile body and an intelligent mind. The NVIDIA GEAR group, led by @DrJimFan and me, is working closely with partner robotics companies to develop robot foundation models, with the goal of deploying millions of general-purpose robots to real-world tasks. Reach out if you want to join us on this mission!
11
57
412
212,120
Heard students say WFH lowers productivity. In 1665, a Cambridge college student had to WFH during a pandemic. He got away from professors and worked on math alone. When he returned, the world knew him as Isaac Newton! Good time to think hard in pajamas. washingtonpost.com/history/2…
8
110
393
Thrilled to co-lead this new team with my long-time collaborator @DrJimFan. We are on a mission to build transformative breakthroughs in the landscape of Robotics and Embodied Agents. Come join us and shape the future together!
Career update: I am co-founding a new research group called "GEAR" at NVIDIA, with my long-time friend and collaborator Prof. @yukez. GEAR stands for Generalist Embodied Agent Research. We believe in a future where every machine that moves will be autonomous, and robots and simulated agents will be as ubiquitous as iPhones. We are building the Foundation Agent — a generally capable AI that learns to act skillfully in many worlds, virtual and real. 2024 is the Year of Robotics, the Year of Gaming AI, and the Year of Simulation. We are setting out on a moon-landing mission, and getting there will spin off mountains of learnings and breakthroughs. Join us on the journey: research.nvidia.com/labs/gea…
16
18
299
63,936
Life update: I will be joining @UTAustin as an Assistant Professor in @UTCompSci starting Fall 2020. I am thrilled to continue my research on robot learning and perception as a faculty member and look forward to collaborating with the exceptional faculty, researchers, and students at UT.
21
13
366
Sharing the slide deck and video recording of my talk "Data Pyramid and Data Flywheel for Robotic Foundation Models" at Princeton Robotics Symposium last November. I discussed the vision of training foundation models on diverse data sources and refining them during deployments. 🗃️ Slide deck: rpl.cs.utexas.edu/talks/2024… 📹 Recording: ai.princeton.edu/events/robo…
11
53
362
25,171
We took a short break from robotics to build a human-level agent to play Competitive Pokémon. Partially observed. Stochastic. Long-horizon. Now mastered with Offline RL + Transformers. Our agent, trained on 475k+ human battles, hits the top 10% on Pokémon Showdown leaderboards. No search or heuristics, just sequence modeling. Today, we're open-sourcing our Metamon platform with our algorithms, data, and environments: 🌐 metamon.tech We are excited to see how our work accelerates research on building generally capable AI agents, and more importantly, inspires the next generation of Pokémon trainers!
10
63
362
50,574
Honored to receive the NSF CAREER award titled "Intelligent Manipulation in the Real World via Modularity and Abstraction" to advance our lab's research on building autonomy stack for general-purpose robot manipulation in the wild! nsf.gov/awardsearch/showAwar…
24
15
334
Thrilled to announce GR00T N1, our open foundation model for generalist humanoid robots! GR00T N1 adopts a dual-system design, leverages the entire data pyramid for model training, and supports various robot embodiments. GR00T N1 embodies years of fundamental research, spanning compositional autonomy stack, synthetic data generation, and scalable training algorithms. We have made our whitepaper, pre-trained models, training datasets, and codebase publicly available. We can't wait to see how our efforts accelerate the future development of humanoid robotics! 🌐 Codebase: github.com/NVIDIA/Isaac-GR00… 🧩 Tech blog: developer.nvidia.com/blog/ac… 📃 Whitepaper: research.nvidia.com/publicat…
8
59
325
28,697
Excited to share our latest progress on legged manipulation with humanoids. We created a VR interface to remote control the Draco-3 robot 🤖, which cooks ramen for hungry graduate students at night. We can't wait for the day it will help us at home in the real world! #humanoid
5
63
309
68,297
Releasing my Stanford Ph.D. dissertation and talk slides "Closing the Perception-Action Loop: Towards Building General-Purpose Robot Autonomy", a summary of my work on robot perception and control @StanfordSVL Slides: stanford.io/2o5OYiN Dissertation: stanford.io/2P7s6uo
3
52
298
Congratulations to @snasiriany and @huihan_liu on winning the #ICRA2022 Outstanding Learning Paper award for their first paper @UTCompSci “Augmenting Reinforcement Learning with Behavior Primitives for Diverse Manipulation Tasks”!
23
19
292
Sim-and-real co-training is the key technique behind GR00T's ability to learn across the data pyramid. Our latest study shows how synthetic and real-world data can be jointly leveraged to train robust, generalizable vision-based manipulation policies. 📚 co-training.github.io/
2
44
283
30,342
Today we are on the road to Austin, TX. I have a pleasant melancholy moving out of the Bay Area, a place where we have lived for seven years, leaving behind many fond memories and long-time friends. Meanwhile, thrilled to start out a new life. Tons of exciting things to come!
22
3
265
AI researchers and Pokémon fans of the world, unite! We launched the PokeAgent Challenge at @NeurIPSConf, inviting researchers to build AI Agents for competitive battles and RPG speedruns. RL, LLM, and Search methods are now climbing our leaderboards. Cash prizes available, and a hackathon this weekend! Details: pokeagent.github.io
5
30
268
23,808
We are accepting research proposals to accelerate Robotics + AI through the NVIDIA Academic Grant Program. Our Edge AI call is seeking projects on GPU simulations, learning-based control, and foundation models for humanoid robotics. Apply by March 31: nvda.ws/3ZNxzuW
5
35
256
24,775
Taught my first (online) class @UTCompSci. Super pumped to teach a grad-level Robot Learning seminar this fall. Great to see UT students from all kinds of backgrounds passionate about learning what’s going on at the forefront of AI + Robotics🤘Syllabus: cs.utexas.edu/~yukez/cs391r_…
6
20
245
Very impressed by the new @Tesla_Optimus end2end skill learning video! Our TRILL work (ut-austin-rpl.github.io/TRIL…) spills some secret sauce: 1. VR teleoperation, 2. deep imitation learning, 3. real-time whole-body control. It's all open-source! Dive in if you're into humanoids! 👾
1
51
240
50,669
Just learned that our MineDojo paper won the Outstanding Paper award at #NeurIPS2022 See you in New Orleans next week! neurips.cc/virtual/2022/awar…
Introducing MineDojo for building open-ended generalist agents! minedojo.org ✅Massive benchmark: 1000s of tasks in Minecraft ✅Open access to internet-scale knowledge base of 730K YouTube videos, 7K Wiki pages, 340K Reddit posts ✅First step towards a general agent 🧵
5
31
243
We are unlikely to create an “ImageNet for Robotics”. In retrospect, ImageNet is such a homogeneous dataset. Labeled images w/ boxes. Generalist robot models will be fueled by the Data Pyramid, blending diverse data sources from web and synthetic data to real-world experiences.
6
30
239
28,586
Some of my proudest memories of my PhD are working with people from different countries and being advised by a stellar all-women thesis committee. I encourage students from diverse backgrounds to apply for my future lab @UTCompSci where diversity and inclusion will be valued.
2
18
224
Had a blast visiting @CMU_Robotics and giving a talk at the RI Seminar today, where I briefly mentioned our new work, PLD, on self-improving VLAs. It achieved 99.2% on LIBERO and a one-hour continuous execution of GPU assembly with a 100% success rate. Check this out!
What if robots could improve themselves by learning from their own failures in the real-world? Introducing 𝗣𝗟𝗗 (𝗣𝗿𝗼𝗯𝗲, 𝗟𝗲𝗮𝗿𝗻, 𝗗𝗶𝘀𝘁𝗶𝗹𝗹) — a recipe that enables Vision-Language-Action (VLA) models to self-improve for high-precision manipulation tasks. PLD couples real-world residual reinforcement learning with standard supervised fine-tuning — letting robots discover, recover, and distill their own data flywheel. Quick 🧵
5
27
224
25,455
One of RL's most future-proof ideas is that adaptation is just a memory problem in disguise. Simple in theory, scaling is hard! Our #ICLR2024 spotlight work AMAGO shows the path to training long-context Transformer models with pure RL. Open-source here: github.com/UT-Austin-RPL/ama…
1
33
190
41,909
Excited to start my gap year in @NvidiaAI! Looking forward to a lot of new research opportunities with the brilliant minds!
7
5
184
Dear academics, check out our 6 pack!! 💪 Ok... I meant 6-PACK, our new 6DoF Pose Anchor-based Category-level Keypoint tracker, real-time tracking of novel objects without known 3D models!
We present 6-PACK, an RGB-D category-level 6D pose tracker that generalizes between instances of classes based on a set of anchors and keypoints. No 3D models required! Code+Paper: tinyurl.com/yycfq5h9 w/ Chen Wang @danfei_xu Jun Lv @cewu_lu @silviocinguetta @drfeifei @yukez
1
34
183
Just wrapped up my #CoRL2023 early-career keynote on 𝐏𝐚𝐭𝐡𝐰𝐚𝐲 𝐭𝐨 𝐆𝐞𝐧𝐞𝐫𝐚𝐥𝐢𝐬𝐭 𝐑𝐨𝐛𝐨𝐭𝐬 on Wed. In case you missed it, here's a brief summary. Check out the slide deck for more detail: rpl.cs.utexas.edu/talks/path… 🧵1/N
4
23
179
22,811
Future driverless cars will talk with each other! We introduce Coopernaut, a cooperative driving model that uses vehicle-to-vehicle (V2V) communication for robust driving in challenging traffic conditions. #CVPR2022 Paper: arxiv.org/abs/2205.02222 Project: ut-austin-rpl.github.io/Coop…
4
19
172
We are organizing a (virtual) workshop on Visual Learning and Reasoning for Robotic Manipulation at #RSS2020. We invite extended abstract submissions that address the research problems at the intersection of perception and manipulation: sites.google.com/view/rss20v…
32
169
Loved the Slow Science Manifesto (slow-science.org/). We were told, "slow down to go faster." Oh boy, this is so much easier said than done. As a young academic, seeing fellow scholars churning out dozens of papers a year, it takes guts to hit the pause button and think!
22
164
As much as I'd like to tweet positivity and focus on #AcademicChatter, I know how difficult this moment is for the Asian community when my wife and I feel anxious about going out for shopping & errands, hearing recent news about hate crimes. Hatred is NOT a solution to a virus.
4
6
160
Implicit neural representations have pushed the envelope of 3D Vision and Graphics in recent years. How will they be useful for Robot Manipulation? Our work GIGA demonstrated that they can bridge geometry reasoning and affordance learning for 6-DoF grasping in cluttered scenes.
2
19
162
Can't wait to attend #CoRL2023 for the next two days and give an early career keynote titled "𝐏𝐚𝐭𝐡𝐰𝐚𝐲 𝐭𝐨 𝐆𝐞𝐧𝐞𝐫𝐚𝐥𝐢𝐬𝐭 𝐑𝐨𝐛𝐨𝐭𝐬: 𝐒𝐜𝐚𝐥𝐢𝐧𝐠 𝐋𝐚𝐰, 𝐃𝐚𝐭𝐚 𝐅𝐥𝐲𝐰𝐡𝐞𝐞𝐥, 𝐚𝐧𝐝 𝐇𝐮𝐦𝐚𝐧𝐥𝐢𝐤𝐞 𝐄𝐦𝐛𝐨𝐝𝐢𝐦𝐞𝐧𝐭" on Wed! corl2023.org/speakers
3
7
162
71,699
🔥robosuite updates🦾After eight months of dev effort, excited to release our v1.3 version! We integrate advanced graphics renderers with our simulation framework and provide vision APIs to bridge robot perception and decision-making research. Try it out! robosuite.ai/
2
27
159
My #CoRL2023 keynote talk on Pathway to Generalist Robots is on YouTube now: piped.video/hFPZSBnJeLc I discussed the three key ingredients for building general-purpose robot autonomy: scaling law, data flywheel, and human-like embodiment. If you want to learn more about our research agenda at @UTAustin RPL and @NVIDIAAI GEAR, this talk will give you a pretty good idea! Slide deck: yukezhu.me/talks/pathway_to_…
30
154
15,437
Thanks Fei-Fei @drfeifei for being such an amazing advisor, mentor, role model, and friend! Finishing a PhD is the end of the beginning. And greater things have yet to come!
Very proud of my PhD student @yukez for passing his PhD thesis defense with flying colors! His work is on perception, learning and robotics. Thank you thesis committee members @leto__jean @EmmaBrunskill @silviocinguetta & Dan Yamins!
4
2
152
Researchers used a randomized controlled trial to show "tweeting improves citations". What improves the long-term impact of a paper? annalsthoracicsurgery.org/ar…
8
28
143
First time attending @HumanoidsConf (on the @UTAustin campus!) Feels pumped to see the lightning-fast progress in this space. I expect this community to proliferate in the next few years --- Generalist robot intelligence can't be achieved without general-purpose hardware!
1
9
148
9,932
Our DexMimicGen work on automated robotic data generation is officially accepted at #ICRA2025! We released our simulations, datasets, and training scripts for reproducibility. Check them out here: github.com/NVlabs/dexmimicge…
How can we scale up humanoid data acquisition with minimal human effort? Introducing DexMimicGen, a large-scale automated data generation system that synthesizes trajectories from a few human demonstrations for humanoid robots with dexterous hands. (1/n)
2
20
148
15,128
Dexterous hands have been the Achilles' Heel of humanoid robots. A pair of reliable, sturdy, and low-cost hands would make robot learning 10-100x easier. The hands' morphologies and mechanics are a big part of the algorithm for reaching human-level dexterity. Give me a hand!
15
7
145
46,034
I won't be at NeurIPS next week. But our team is seeking interns to work on exciting and ambitious new projects on Large Language Models for Agents (starting early next year). Please fill out the Application Form below if you're interested.
1
7
55
28,488
📢Release note📢 We are pleased to release *robosuite* v1.4 and migrate its backend to @DeepMind's MuJoCo binding for long-term support and feature extensibility, solidifying our commitment to building open-source research software. Try it out at robosuite.ai
2
19
139
We are releasing our #ICCV2019 work on goal-directed visual navigation. We introduced a method that harnesses different perception skills based on situational awareness. It makes a robot reach its goals more robustly and efficiently in new environments. arxiv.org/abs/1908.09073
2
32
138
100% agreed! I also felt extremely lucky to have some of the kindest and smartest advisors @Stanford and colleagues @UTCompSci "We're all smart. Distinguish yourself by being kind." This quote is one of the first principles I will teach to my students as a scholar.
2
13
130
Excited to share our new work, MimicDroid 🤖: Learning from human play videos gives rise to few-shot, in-context learning capabilities for humanoid manipulation.
Intelligent humanoids should have the ability to quickly adapt to new tasks by observing humans Why is such adaptability important? 🌍 Real-world diversity is hard to fully capture in advance 🧠 Adaptability is central to natural intelligence We present MimicDroid 👇 🌐 ut-austin-rpl.github.io/Mimi…
3
18
136
18,444
Check out a new blog post of our work on long-horizon planning for robot manipulation. We also released RoboVat, our learning framework that unifies #BulletPhysics simulation and Sawyer robot control interfaces. Sim2real has never been easier. github.com/StanfordVL/robova…
How can a robot solve complex sequential problems? In our newest blog post, @KuanFang introduces CAVIN, an algorithm that hierarchically generates plans in learned latent spaces. ai.stanford.edu/blog/cavin/
1
21
126
robosuite (robosuite.ai/) has been a true labor of love for the past seven years. Building this open-source simulation framework has required massive collaboration across institutions. Open-source software often goes underappreciated in academic culture. robosuite was never published, but it means so much to me. A senior faculty member once advised me (an assistant prof) to avoid engineering projects that wouldn’t "beef up the publication records." But building robosuite for the research community felt like the right thing to do and was worth every bit of my time and energy. I am incredibly grateful to be surrounded by students and collaborators who share this passion and have dedicated themselves to making open-source robot learning tools available for all. Kudos to @yifengzhu_ut, @snasiriany, @AjayMandlekar, @linkevin0, @abhihjoshi, @josiah_is_wong, @RobobertoMM and many other contributors over the years!
2
18
130
9,759
Another key result from my lab in leveraging human-centered data sources for humanoid robots — this time, human motion captures. By training on large-scale mocap databases and remapping human motions to humanoids, Harmon enables the robots to generate motions from text commands.
Excited to share our #CoRL2024 paper on humanoid motion generation! Combining human motion priors with VLM feedback, Harmon generates natural, expressive, and text-aligned humanoid motion from freeform text descriptions. 👇(1/4)
3
16
122
16,177
ICML deadline tonight, RSS deadline tomorrow, and CVPR rebuttals due next Monday. For researchers working on robot learning and perception, life is goooood 😌

3
3
120
Roomba builds a static map of your home by moving around. Can a robot create articulated models of indoor scenes through its physical interaction? Ditto in the House builds digital twins of articulated objects in everyday environments. #ICRA2023 Website: ut-austin-rpl.github.io/Hous…
16
124
19,449
Our robot can now make morning coffee for you… The secret recipe: 1⃣ Object-centric representation 2⃣ Transformer-based policy architecture 3⃣ Data-efficient imitation learning algorithm 4⃣ Robust impedance controller. Enjoy ☕️! #CoRL2022 #VIOLA
1
11
122
Before the coronavirus outbreak, I almost decided to name my new lab VIRAL, which stands for Visual Intelligence & Robot Autonomy Lab. Now I have to change it 😅 Epidemics make us think harder.
5
1
111
Visiting @Princeton today to speak at the Symposium on Safe Deployment of Foundation Models in Robotics. Fall is a beautiful season to see the Princeton campus! Event website: ai.princeton.edu/princeton-s…
5
5
122
10,970
The research collaboration between NVIDIA GEAR and 1X has been a blast! I am amazed by how quiet, compliant, and friendly the Neo robot is. I look forward to seeing more incredible things we will build together. 1x.tech/discover/1X-NVIDIA-R…
1
11
113
8,571
Check out our new survey paper on Foundation Models in Robotics!
Foundation Models in Robotics: Applications, Challenges, and the Future paper page: huggingface.co/papers/2312.0… We survey applications of pretrained foundation models in robotics. Traditional deep learning models in robotics are trained on small datasets tailored for specific tasks, which limits their adaptability across diverse applications. In contrast, foundation models pretrained on internet-scale data appear to have superior generalization capabilities, and in some instances display an emergent ability to find zero-shot solutions to problems that are not present in the training data. Foundation models may hold the potential to enhance various components of the robot autonomy stack, from perception to decision-making and control. For example, large language models can generate code or provide common sense reasoning, while vision-language models enable open-vocabulary visual recognition. However, significant open research challenges remain, particularly around the scarcity of robot-relevant training data, safety guarantees and uncertainty quantification, and real-time execution. In this survey, we study recent papers that have used or built foundation models to solve robotics problems. We explore how foundation models contribute to improving robot capabilities in the domains of perception, decision-making, and control. We discuss the challenges hindering the adoption of foundation models in robot autonomy and provide opportunities and potential pathways for future advancements. The GitHub project corresponding to this paper (Preliminary release.
3
23
112
22,336
Our #ICRA2019 paper received the Best Conference Paper Award w/ @michellearning @leto__jean @animesh_garg @drfeifei @silviocinguetta
6
5
112
Heading to @CVPR today! We are organizing a 3D Vision and Robotics workshop tomorrow with a great line-up of speakers: sites.google.com/view/cvpr20… Also, I am recruiting a postdoc on vision + robotics for my group. Come to chat with me if interested - DMs are open!
17
108
31,827
2 years ago I was shopping for a coffee machine at Target. I found a perfect Keurig not for me but for my robot: - Round tray to insert a K-cup; - Lid open/close w/ weak forces; - Coffee out w/ one button click. There's no magic. Human ingenuity is behind every robot's success.
3
9
87
39,709
MimicGen source code is now publicly available! Our system generates automated robot trajectories from a handful of human demonstrations, enabling large-scale robot learning: mimicgen.github.io/
Want to generate large-scale robot demonstrations automatically? We have released the full MimicGen code. Excited to see what the community will do with this powerful data generation tool! Code: github.com/NVlabs/mimicgen Docs: mimicgen.github.io/docs/intr…
17
114
16,918
We are organizing a #CVPR2021 Workshop on 3D Vision and Robotics to promote the cross-pollination of ideas between these two research fields. CfP is open. We look forward to your contributions! sites.google.com/view/cvpr20…
20
111
Rewriting classical robot controller with physics-informed neural network, plugging it as learnable module into data-driven autonomy stack, trained with large-scale GPU-accelerated simulation ➡️ Adaptivity & Robustness to the next level💡
How can we enable robot controllers to better adapt to changing dynamics? Idea: learn a data-driven controller implemented with physics-informed neural networks, and finetune on task-specific dynamics. Website: cremebrule.github.io/oscar-w… Paper: arxiv.org/abs/2110.00704
18
105
Excited to see Jensen share our recent progress on the N1.5 foundation model and GR00T Dreams from the NVIDIA GEAR team! Several team members and I will be at #ICRA2025. Come chat with us about building generalist robot autonomy and ways to work together on this grand mission!
Jensen just announced NVIDIA’s Isaac GR00T N1.5 and GR00T-Dreams blueprint at COMPUTEX 2025: ⦿ Isaac GR00T N1.5 is the first update to NVIDIA’s open, generalized, fully customizable foundation model for humanoid reasoning and skills. ⦿ “Human demonstrations aren’t scalable — limited by the number of hours in a day,” said Jensen. The GR00T-Dreams blueprint enables the generation of vast synthetic motion data from single images, accelerating robot behavior learning via compressed action tokens. ⦿ Synthetic training data generated by GR00T-Dreams was used to develop GR00T N1.5 in just 36 hours — what would have taken nearly three months without the blueprint. ⦿ This update significantly improves the foundation model’s success rate for common material handling and manufacturing tasks. GR00T N1.5 can be deployed on Jetson Thor, launching later this year.
6
7
108
13,454
Heading to #CoRL2025 and looking forward to sharing our latest updates on Isaac GR00T for advancing humanoid robotics!
The rise of humanoid platforms presents new opportunities and unique challenges. 🤖 Join @yukez at #CoRL2025 as he shares the latest research on robot foundation models and presents new updates with the #NVIDIAIsaac GR00T platform. Learn more 👉nvda.ws/4gdfBYY
2
6
106
13,963
Visited the University of Tokyo for the #RSS2024 area chair meeting. Paper decisions have been made and will be announced on Monday. I will attend #ICRA2024 in Yokohama next week. Looking forward to connecting with the Japanese robotics community, particularly on humanoid robots!
1
4
100
12,829
Delighted to present our recent work on hierarchical Scene Graphs for neuro-symbolic manipulation planning. We use 3D Scene Graphs as an object-centric abstraction to reason about long-horizon tasks. w/ @yifengzhu_ut, Jonathan Tremblay, Stan Birchfield arxiv.org/abs/2012.07277
15
96
Excited to share my recent talk at the Stanford Robotics Seminar on “Objects, Skills, and the Quest for Compositional Robot Autonomy” featuring projects from my first year @UTCompSci and our lab’s vision of building the next generation of autonomy stack. piped.video/watch?v=xwviwNTI…
1
11
99
We have just released our new work on 6D pose estimation from RGB-D data -- real-time inference with end-to-end deep models for real-world robot grasping and manipulation! Paper: arxiv.org/abs/1901.04780 Code: github.com/j96w/DenseFusion w/ @danfei_xu @drfeifei @silviocinguetta
3
23
96
🎉Public release🎉 Thrilled to kickstart our new embodied AI moonshot: building general-purpose open-ended agents with Internet-scale knowledge!
Introducing MineDojo for building open-ended generalist agents! minedojo.org ✅Massive benchmark: 1000s of tasks in Minecraft ✅Open access to internet-scale knowledge base of 730K YouTube videos, 7K Wiki pages, 340K Reddit posts ✅First step towards a general agent 🧵
1
9
94
Excited to share VIMA, our latest work on building generalist robot manipulation agents with multimodal prompts. Massive transformer model + unified task specification interface for the win!
We trained a transformer called VIMA that ingests *multimodal* prompts and outputs controls for a robot arm. A single agent is able to solve visual goal, one-shot imitation from video, novel concept grounding, visual constraint, etc. Strong scaling with model capacity and data!🧵
13
94
I felt fortunate to attend all four CoRL conferences in the past and to serve as an AC for the first time. @corl_conf is hands down my favorite conference - focused Robot Learning community, high-quality (<200) papers, YouTube live stream, inclusion events. I couldn't ask for more!
3
90
Texas is a booming state for robotics research and industry. We are bringing together robotics researchers across the state this Friday at Texas Regional Robotics Symposium (TEROS) 2022. Great line-up of speakers and live stream for all talks. Join us at teros-texas.github.io!
14
89
We'll witness more and more demos of humanoid robots doing the same tasks the robotics community has mastered with simpler systems. Yet people will still be awed. It speaks more about human psychology than technology. Humanoids make sense in domains requiring social interaction.
7
3
94
11,506
We've released an updated version of ACID, our #RSS2022 paper on volumetric deformable manipulation, with real-robot experiments. ACID predicts dynamics, 3d geometry, and point-wise correspondence from partial observations. It learns to maneuver a cute teddy bear into any pose.
1
9
88
All talk recordings of our #CVPR2023 3D Vision and Robotics Workshop are now available on the YouTube playlist: piped.video/playlist?list=PL…. Check them out in case you missed the event!
Heading to @CVPR today! We are organizing a 3D Vision and Robotics workshop tomorrow with a great line-up of speakers: sites.google.com/view/cvpr20… Also, I am recruiting a postdoc on vision + robotics for my group. Come to chat with me if interested - DMs are open!
13
85
11,600
Spot-on! Top AI researchers and institutes have the magic power of pushing a research field years back, simply by publishing initial papers and inadvertently creating a vicious cycle of worthless publications. With great power comes great responsibility.
A lot of machine learning research has detached itself from solving real problems, and created their own "benchmark-islands". How does this happen? And why are researchers not escaping this pattern? A thread 🧵
8
81
Sharing the slides of my talk "Learning Keypoint Representations for Robot Manipulation" presented at the Workshop on Learning Representations for Planning and Control @IROS2019MACAU Slides: stanford.io/2QsR3kv Workshop: sites.google.com/view/iros-2…
1
18
81
We have six papers to be presented at #ICRA2021 this week, spanning the topics of imitation learning for manipulation, neuro-symbolic planning, multimodal perception, uncertainty quantification, and morphological computation. A thread /5
1
2
81
I will attend ICML in Hawaii next week to present VIMA (vimalabs.github.io/) and meet friends. Our NVIDIA team is seeking new talent for AI Agents, LLMs, and Robotics. Reach out via DMs if interested!
I'm going to ICML in Hawaii! My team pushes the research frontier in AI agents, multimodal LLMs, game AI, and robotics. If you're interested in joining NVIDIA or collaborating with me, please reach out by email! My contact info is at jimfan.me/ If applicable, please attach your CV, a paragraph of self-intro, and your research interests. I look forward to meeting you in Honolulu!
6
78
30,587
Ajay gave a great talk on our RoboTurk project at #IROS2019, nominated for Best Paper on Cognitive Robotics. Large-scale real robot dataset through crowd teleoperation! More information can be found at roboturk.stanford.edu
19
80
Uploading physical objects to the virtual world (metaverse) by observing and interacting with them in the real world. Exciting new work on sim2real via real2sim with articulated objects #CVPR2022 #Ditto
1
8
77
NVIDIA's Academic Grant Program is back! Submit your groundbreaking ideas on Robotics and Edge AI (humanoid robotics, foundation models, simulations, ...) and turn them into reality.
Need support for your #robotics research? We’re excited to announce that our Academic Grant Program is now accepting proposals for robotics and edge AI projects. Resources include #NVIDIAJetson Orin Nano Developer kits, and more. 👉 nvda.ws/3TpkmUC
3
3
73
10,622
robosuite v1.2 released: new sensor simulation APIs, visual/dynamics/sensor randomization for sim2real, enhanced operational space controllers, and human demonstrations! Check it out from here: github.com/ARISE-Initiative/…
6
72
Check out our new work, BUMBLE — Vision-language models (VLMs) act as the "operating system" for robots, calling perceptual and motor skills through APIs. The stronger the core VLM's capabilities, the better the robot gets at mobile manipulation.
🤖 Want your robot to grab you a drink from the kitchen downstairs? 🚀 Introducing BUMBLE: a framework to solve building-wide mobile manipulation tasks by harnessing the power of Vision-Language Models (VLMs). 👇 (1/5) 🌐 robin-lab.cs.utexas.edu/BUMB…
6
71
7,139
We’re advancing automated runtime monitoring and fleet learning with visual world models — a pivotal step toward building a data flywheel for robot learning. Kudos to @huihan_liu for spearheading the Sirius projects in my lab. Very proud of her achievements!
With the recent progress in large-scale multi-task robot training, how can we advance the real-world deployment of multi-task robot fleets? Introducing Sirius-Fleet✨, a multi-task interactive robot fleet learning framework with 𝗩𝗶𝘀𝘂𝗮𝗹 𝗪𝗼𝗿𝗹𝗱 𝗠𝗼𝗱𝗲𝗹𝘀! 🌍 #CoRL2024
3
6
69
8,090
Pleased to be invited by @SamsungUS to talk about my research on robot perception and learning. Covered our latest work on self-supervised sensorimotor learning, hierarchical planning, and cognitive learning and reasoning in the open world. Video: piped.video/vxdxIBXC9x4
1
10
73
Excited to introduce 𝚛𝚘𝚋𝚘𝚖𝚒𝚖𝚒𝚌, a new framework for Robot Learning from Demonstration. This open-source library is a sister project of 𝚛𝚘𝚋𝚘𝚜𝚞𝚒𝚝𝚎 in our ARISE Initiative. Try it out!
Robot learning from human demos is powerful yet difficult due to a lack of standardized, high-quality datasets. We present the robomimic framework: a suite of tasks, large human datasets, and policy learning algorithms. Website: arise-initiative.github.io/r… 1/
6
67
Our department @UTCompSci @UTAustin is recruiting new Robotics faculty this year. Come join us in the booming city of Austin! cs.utexas.edu/faculty/recrui…
21
68
Pleased to see our Sirius paper nominated for the Best Paper Award #RSS2023: roboticsconference.org/progr… Join our presentation in Daegu, Korea on July 11th! Exciting times ahead as our lab explores the new frontier of 𝗥𝗟𝗢𝗽𝘀 (Robot Learning + Operations) in long-term deployment.
Just as the best chess players are human-AI teams (centaurs), trustworthy deployment of robot learning models needs such a partnership! Sirius is our first milestone toward Continuous Integration and Continuous Deployment (CI/CD) for robot autonomy during long-term deployments👇
3
3
69
14,025
A nice summary of our recent works on imitation learning from visual demonstration. Compositionality and abstraction are key to scaling up IL algorithms to long-horizon manipulation tasks.
What if we can teach robots to do a new task just by showing them one demonstration? In our newest blog post, @deanh_tw and @danfei_xu show us three approaches that leverage compositionality to solve long-horizon one-shot imitation learning problems. ai.stanford.edu/blog/ntp-ntg…
1
13
69
Looking forward to sharing our latest progress on GPU-accelerated robotics simulation in the Isaac Gym tutorial @RoboticsSciSys 2021 next Monday.
Join us on July 12th at #RSS. This workshop will introduce the end-to-end GPU accelerated training pipeline in #NVIDIA Isaac Gym, demonstrate #robotics applications, and answer questions in breakout sessions. Register here: nvda.ws/3hzBe7y #AI #robots #nvidiaisaac
5
65
I will give a talk at #SXSW2024 on How to Train a Humanoid Robot tomorrow from 10 to 11:30 a.m. Come check out our ramen-cooking DRACO 3 robot developed @texas_robotics and learn the technical stories behind it!
2
5
63
10,605
Our Eureka follow-up work is out!
Introducing DrEureka🎓, our latest effort pushing the frontier of robot learning using LLMs! DrEureka uses LLMs to automatically design reward functions and tune physics parameters to enable sim-to-real robot learning. DrEureka can propose effective sim-to-real configurations for several robots and tasks, and we even got a bit creative with it: Let’s make a robot dog walk and balance on a yoga ball! Check out these fun videos, and follow the thread for a deep dive!
1
6
60
10,947