Adding a head-mounted camera is the #1 requested feature for UMI, but “just adding a camera” is harder than it looks🙃 First, just adding it actually hurts performance (a lot!) due to a much bigger embodiment gap -- the camera now sees more of the body, human neck motion differs from a robot's, and even different operator heights can now introduce inconsistencies -- all issues UMI was originally designed to bypass. So, how can we bridge this gap and actually benefit from the extra sensing? Check out @xiaomeng’s post on HoMMI 🧵👇
Can we learn whole-body mobile manipulation directly from human demonstrations? Introducing Whole-Body Mobile Manipulation Interface (HoMMI) Egocentric + UMI, 0 teleop -> bimanual & whole-body manipulation, long-horizon navigation, active perception hommi-robot.github.io
5
10
100
20,427
It took a lot of trust to let @Zhenjia_Xu and @zhou_xian_ deploy this backflip policy on our physical robot 😉 -- really impressed by the result in the end! so excited about this simulator and what it can do for future robots🤖!
Everything you love about generative models, now powered by real physics! Announcing the Genesis project: after a 24-month large-scale research collaboration involving over 20 research labs, a generative physics engine able to generate 4D dynamical worlds powered by a physics simulation platform designed for general-purpose robotics and physical AI applications. Genesis's physics engine is developed in pure Python, while being 10-80x faster than existing GPU-accelerated stacks like Isaac Gym and MJX. It delivers a simulation speed ~430,000 times faster than real time, and takes only 26 seconds to train a robotic locomotion policy transferable to the real world on a single RTX4090 (see tutorial: genesis-world.readthedocs.io…). The Genesis physics engine and simulation platform is fully open source at github.com/Genesis-Embodied-…. We'll gradually roll out access to our generative framework in the near future. Genesis implements a unified simulation framework all from scratch, integrating a wide spectrum of state-of-the-art physics solvers, allowing simulation of the whole physical world in a virtual realm with the highest realism. We aim to build a universal data engine that leverages an upper-level generative framework to autonomously create physical worlds, together with various modes of data, including environments, camera motions, robotic task proposals, reward functions, robot policies, character motions, fully interactive 3D scenes, open-world articulated assets, and more, aiming towards fully automated data generation for robotics, physical AI and other applications. Open Source Code: github.com/Genesis-Embodied-… Project webpage: genesis-embodied-ai.github.i… Documentation: genesis-world.readthedocs.io… 1/n
10
41
489
57,426
The Internet is too fast, I’m still crafting my catchy tweets, and word is already out😂 Well then, now you have it: RoboNinja🥷: Learning an Adaptive Cutting Policy for Multi-Material Objects roboninja.cs.columbia.edu/ 🧵👇 for a few interesting details you might have missed
4
46
350
69,592
How to precisely swing an *unknown* rope to hit a target? It is a challenging task even for us due to complex system dynamics - introduced by object deformation and high-speed dynamic actions. Iterative Residual Policy (irp.cs.columbia.edu) is our attempt, details🧵⬇️1/n
3
41
313
We recently launched umi-data.github.io as a community-driven effort to pool UMI-related data together. 🦾 If you are using a UMI-like system, please consider adding your data here. 🤩🤝 No dataset is too small; small data WILL add up!📈
4
40
247
55,335
Check out UMI! 3 things I learned in this project: 1. Wrist-mount cameras can be sufficient for challenging manipulation tasks with the right hardware design. 2. Cross-embodiment policy is possible with the right policy interface. 3. BC can generalize if the data is right.
Can we collect robot data without any robots? Introducing Universal Manipulation Interface (UMI) An open-source $400 system from @Stanford designed to democratize robot data collection 0 teleop -> autonomously wash dishes (precise), toss (dynamic), and fold clothes (bimanual)
3
31
238
64,024
Interesting name “Harmonic Reasoning” — Indeed, orchestrating asynchronous, multi-frequency, continuous-time streams of sensing and action is what makes robot learning challenging and unique. It’s great to see more learning architectures designed specifically to address these properties, rather than simply relying on a network built for other domains, such as vision or language.
Introducing GEN-0, our latest 10B+ foundation model for robots ⏱️ built on Harmonic Reasoning, new architecture that can think & act seamlessly 📈 strong scaling laws: more pretraining & model size = better 🌍 unprecedented corpus of 270,000+ hrs of dexterous data Read more 👇
8
22
229
21,957
More robots do not always lead to higher productivity if they don’t collaborate ;) Check out our latest work multiarm.cs.columbia.edu in #CORL2020. Despite being trained on static tasks with 1-4 arms, the system generalizes to 5-10 arms with dynamic targets w/ Huy Ha, Jingxi Xu
4
44
209
UMI got the Outstanding System Paper finalist #RSS2024. Congratulations team!! 🥳 Hope to see more UMI running around the world 😊 !
Check out UMI! 3 things I learned in this project: 1. Wrist-mount cameras can be sufficient for challenging manipulation tasks with the right hardware design. 2. Cross-embodiment policy is possible with the right policy interface. 3. BC can generalize if the data is right.
6
8
195
31,987
DextAIRity: Deformable Manipulation Can be a Breeze!#RSS2022 A different way to manipulate objects using controlled airflow that reaches beyond contact 🤖 w. @Zhenjia_Xu, @chichengcc, @Ben_Burchfiel ,@eacousineau, Siyuan Feng @CAIR_lab +@ToyotaResearch 🦾
5
53
189
Always enjoy reading Seohong's papers; they are incredibly clear and thought-provoking -- Seohong needs to write more :)
Q-learning is not yet scalable seohong.me/blog/q-learning-i… I wrote a blog post about my thoughts on scalable RL algorithms. To be clear, I'm still highly optimistic about off-policy RL and Q-learning! I just think we haven't found the right solution yet (the post discusses why).
2
9
181
22,589
Congratulations to Huy for winning the best system paper at CORL @corl_conf. It was so much fun building FlingBot and seeing the system work :)
9
8
177
Honored to be selected as a Sloan Fellow. I am grateful to my mentors, collaborators, and most importantly my awesome students!! Thank you all!
The @SloanFoundation picked five @Columbia scientists for research fellowships this year. Congrats @jcolinhill, @GKaragiorgi, @KaczaLab, @SongShuran + @henryquantum! bit.ly/33pMFfg @CUSEAS @ColumbiaQuantum @ColumbiaCompSci @NevisLabs
24
3
167
Still deciding where to place the robot camera? 🤔 Why not everywhere?! 👀 Meet RoboPanoptes – the All-Seeing Robot with Whole-Body Dexterity! 🤖✨ With 21 cameras 📷distributed across its body, RoboPanoptes sees and operates from every angle. And yes, its name is inspired by Argos Panoptes, the many-eyed giant of Greek mythology! 👁️ 😉
Can robots leverage their entire body to sense and interact with their environment, rather than just relying on a centralized camera and end-effector? Introducing RoboPanoptes, a robot system that achieves whole-body dexterity through whole-body vision. robopanoptes.github.io/
3
14
161
15,290
Diffusion Policy for robots! The most impressive thing to me is how fast we can deploy a new skill with this framework -- and we just keep adding more and more. Cheng has made the framework really easy to use, so you can try it out too. Colab & Github: diffusion-policy.cs.columbia…
What if the form of visuomotor policy has been the bottleneck for robotic manipulation all along? Diffusion Policy achieves 46.9% improvement vs prior SoTA on 11 tasks from 4 benchmarks + 4 real world tasks! (1/7) website : diffusion-policy.cs.columbia… paper: arxiv.org/abs/2303.04137
22
157
27,850
Universal Manipulation Policy Network – a single policy learns to manipulate a diverse set of articulated objects (e.g., fridge, laptop, or drawers) regardless of their joint types or # links. ump-net.cs.columbia.edu w. @zhenjia @zhanpeng_he Things we learned 🧵⬇️1/n
3
29
158
Montessori busy boards for robots! We're open-sourcing a toy-inspired robot learning environment for developing essential interaction, reasoning, and planning skills. Let's give our robot toddlers toys to play with before asking them for help in the kitchen ;) (1/n)
2
21
157
Semantic abstraction -- give CLIP new 3D reasoning capabilities, so your robots can find that “ rapid test behind the Harry Potter book.” 😉 w. Huy Ha
Semantic Abstraction: Open-World 3D Scene Understanding from 2D Vision-Language Models abs: arxiv.org/abs/2207.11514 project page: semantic-abstraction.cs.colu…
28
157
🚀 Meet ToddlerBot 🤖– the adorable, low-cost, open-source humanoid anyone can build, use, and repair! We’re making everything open-source & hope to see more Toddys out there!
Time to democratize humanoid robots! Introducing ToddlerBot, a low-cost ($6K), open-source humanoid for robotics and AI research. Watch two ToddlerBots seamlessly chain their loco-manipulation skills to collaborate in tidying up after a toy session. toddlerbot.github.io/
6
16
155
11,452
Congratulations to @chichengcc for winning the Best Paper Award and @Zhenjia_Xu for the Best Systems Paper Finalist at #RSS2022 !! 🥳🎉
Big congrats to the winners (and the finalists!) of the #RSS2022 awards: - Best paper award: roboticsconference.org/progr… - Best systems paper award: roboticsconference.org/progr… - Best student paper award: roboticsconference.org/progr…
11
8
141
Honored to be a Microsoft Research Faculty Fellow!
Congratulations to the 2021 Microsoft Research Faculty Fellows! This fellowship recognizes innovative, promising new faculty whose exceptional talent for innovation identifies them as emerging leaders in their fields. Learn about their research interests: aka.ms/AAcu80l
4
145
Excited to receive the NSF CAREER award! I'm grateful to all my students @CAIRLab, mentors, and collaborators for making this possible 😊 and thank you, Holly and Bernadette, for writing this nice article that summarizes our research. 🤖
Congrats to our @ColumbiaCompSci Prof Shuran Song @SongShuran, who's won an @NSF CAREER award to enable #Robots to learn on their own and adapt to new environments. bit.ly/3smzRj2 @ColumbiaScience @Columbia
10
13
143
Latent Policy Barrier: A simple and practical way to make any pretrained diffusion policy more robust— ✅ No human correction data ✅ No reward functions ✅ No model finetuning LPB uses a latent dynamics model (trained on unlabeled rollout data) to keep the policy execution in-distribution — effectively creating a "safety barrier" around the original policy. More details 👇
How to prevent behavior cloning policies from drifting OOD on long horizon manipulation tasks? Check out Latent Policy Barrier (LPB), a plug-and-play test-time optimization method that keeps BC policies in-distribution with no extra demo or fine-tuning: project-latentpolicybarrier.…
2
18
143
15,178
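A toy illustration of the idea described in the two posts above: use a dynamics model in latent space to prefer actions that keep the predicted next state close to what the demonstrations covered. This is a minimal sketch under loud assumptions, not the LPB algorithm. The additive `toy_dynamics`, the 2-D latents, the candidate actions, and the pick-the-best-candidate rule are all made up for illustration; the posts only say that LPB is a test-time optimization method built on a latent dynamics model.

```python
import math


def nearest_distance(z, expert_latents):
    """Distance from latent z to the closest latent seen in demonstrations."""
    return min(math.dist(z, e) for e in expert_latents)


def toy_dynamics(z, action):
    """Hypothetical stand-in for a learned latent dynamics model: z' = z + a."""
    return tuple(zi + ai for zi, ai in zip(z, action))


def steer(z, candidate_actions, expert_latents, dynamics=toy_dynamics):
    """Pick the candidate whose predicted next latent stays closest to the demos."""
    return min(
        candidate_actions,
        key=lambda a: nearest_distance(dynamics(z, a), expert_latents),
    )


# Demonstration latents lie along the x-axis; the current state has drifted upward.
expert_latents = [(float(x), 0.0) for x in range(5)]
current = (2.0, 1.5)

# Candidates a frozen base policy might propose (invented for this example).
candidates = [(1.0, 0.5), (1.0, 0.0), (1.0, -1.0)]

best = steer(current, candidates, expert_latents)
print(best)
```

The base policy is never modified here; only the choice among its proposals changes, which mirrors the "no finetuning" claim in spirit.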
Embodiment is such a critical component of Embodied Intelligence but often gets overlooked. Can robots learn to generate different embodiments (i.e., hardware designs) for different tasks that drastically simplify perception, planning, and control? Check it out ⬇️
Can we automate task-specific mechanical design without task-specific training? Introducing Dynamics-Guided Diffusion Model for Robot Manipulator Design, a data-driven framework for generating manipulator geometry designs for given manipulation tasks. w. Huy Ha, @SongShuran
1
14
135
19,777
There’s something satisfying about seeing the robot slot in the box flaps so nicely in the end ... 😌
Today we're excited to share a glimpse of what we're building at Generalist. As a first step towards our mission of making general-purpose robots a reality, we're pushing the frontiers of what end-to-end AI models can achieve in the real world. Here's a preview of our early results in autonomous general-purpose dexterous capabilities – fast, reactive, smooth, precise, bi-manual coordinated sensorimotor control.
1
7
131
13,532
Have to share these epic fails ... "We've broken 3 legs, fried 1 Jetson, and ripped one pair of pants, so you don't have to" 😅 Check here for details: umi-on-legs.github.io/😉
2
11
137
15,650
Dynamic manipulation turns out to be so much more effective for cloth unfolding! Check out FlingBot -- unfold your shirt in 3 steps! 😉 Code for both simulation & real robots is available! flingbot.cs.columbia.edu #CORL2021 w/ Huy Ha
6
20
132
Real2Code -- translating real-world articulated objects to sim using code generation! With the code representation, this method scales well wrt the number of object parts, check out the 10-drawer table it reconstructed 😉
Here’s something you didn’t know LLMs can do – reconstruct articulated objects! Introducing Real2Code – our new real2sim approach that scalably reconstructs complex, multi-part articulated objects. arxiv.org/abs/2406.08474
1
16
133
21,465
Finally! 🚀 TidyBot++ is out!! An open-source mobile manipulator that you can build yourself. 🛠️ Check out the website and Jimmy's post for more 👇
When will robots help us with our household chores? TidyBot++ brings us closer to that future. Our new open-source mobile manipulator makes it more accessible and practical to do robot learning research outside the lab, in real homes!
1
14
131
16,140
Don't want to collect hundreds of demonstrations for every object and scenario? Check out EquiBot from @yjy0625 --- Leveraging equivariance in diffusion policy to make it sample-efficient and generalizable!
Want a robot that learns household tasks by watching you? EquiBot is a ✨ generalizable and 🚰 data-efficient method for visuomotor policy learning, robust to changes in object shapes, lighting, and scene makeup, even from just 5 mins of human videos. 🧵↓
17
128
17,254
Nice design for dexterous hand data collection, with zero embodiment gap!!
How do we unlock the full dexterity of robot hands with data, even beyond what teleoperation can achieve? DEXOP captures natural human manipulation with full-hand tactile & proprio sensing, plus direct force feedback to users, without needing a robot👉dex-op.github.io/
4
5
126
14,313
By plugging a $5 contact microphone 🎤into UMI, we can now "hear" 👂all the critical contact events during manipulation and "feel" ☝️the subtle differences on the contact surface. Check out @Liu_Zeyi_ 's new work on ManiWAV: Manipulation from In-the-Wild Audio-Visual Data!
🔊 Audio signals contain rich information about daily interactions. Can our robots learn from videos with sound? Introducing ManiWAV, a robotic system that learns contact-rich manipulation skills from in-the-wild audio-visual data. See thread for more details (1/4) 👇
1
15
120
11,779
Meet the newest member of the UMI family: DexUMI! Designed for intuitive data collection -- and it fixes a few things the original UMI couldn’t handle: 🖐️ Supports multi-finger dexterous hands -- tested on both under- and fully-actuated types 🧂 Records tactile info -- it can tell if you're picking up salt 📍 More robust tracking -- powered by ARKit (thanks, Apple 👀) Downside? It’s a bit less portable than OG UMI -- all those new sensors and gears do come at a cost 😅 Still more work to do, but we're excited!
Can we collect robot dexterous hand data directly with human hand? Introducing DexUMI: 0 teleoperation and 0 re-targeting dexterous hand data collection system → autonomously complete precise, long-horizon and contact-rich tasks Project Page: dex-umi.github.io
6
19
129
27,945
Video generation is getting crazily good 🤯 -- but when used as robot policies, video models are still way too slow or imprecise 😔 Unified Video Action Model (UVA) makes it a bit more practical. ⬇️ 😊 If you are searching for a better robot policy🤖, a world model🌏, or an inverse dynamics model🦾, check it out. UVA can do them all!
Video generation is powerful but too slow for real-world robotic tasks. How can we enable both video and action generation while ensuring real-time policy inference? Check out our work on the Unified Video Action Model (UVA) to find out! unified-video-action-model.g… (1/7)
3
15
115
14,061
;)
@SongShuran summarized the dynamics in many academic labs so well 😂
3
114
13,436
This demo says it all! Holonomic, high payload, easy to use, and sooo much FUN! 😉
Replying to @jimmyyhwu
TidyBot++ is "batteries included" We've open sourced everything: • Hardware design • Low-level controller • Phone teleoperation interface • Policy learning pipeline Project page: tidybot2.github.io Docs: tidybot2.github.io/docs Get started building your own TidyBot++ today!
4
8
103
17,769
UMI is really "taking off" and flying!!! ✈️😊
✈️🤖 What if an embodiment-agnostic visuomotor policy could adapt to diverse robot embodiments at inference with no fine-tuning? Introducing UMI-on-Air, a framework that brings embodiment-aware guidance to diffusion policies for precise, contact-rich aerial manipulation.
13
103
12,447
Struggling with your 2D visual predictive models that keep losing track of objects? Time to try out this 3D dynamic scene representation (DSR) dsr-net.cs.columbia.edu at #CORL2020. w. zhenjia_xu @zhanpeng_he @jiajunwu_cs
1
23
102
What makes a robot hand design better at learning from human demonstrations? Is it being similar in size to a human hand, or matching its degrees of freedom? DexMachina lets us explore this question in simulation — and the results are quite interesting! Check it out 😉
How to learn dexterous manipulation for any robot hand from a single human demonstration? Check out DexMachina, our new RL algorithm that learns long-horizon, bimanual dexterous policies for a variety of dexterous hands, articulated objects, and complex motions.
6
104
7,071
One of the common questions I get for UMI is how to apply it to mobile robots, esp. when we don't have a precise IK solver. Check out UMI-on-legs! With a manipulation-centric whole-body controller, we can put any UMI skills on a legged robot🐕 Video: piped.video/4Bp0q3xHTxE
I’ve been training dogs since middle school. It’s about time I train robot dogs too 😛 Introducing, UMI on Legs, an approach for scaling manipulation skills on robot dogs🐶It can toss, push heavy weights, and make your ~existing~ visuo-motor policies mobile!
7
99
15,674
We are training ToddlerBot like a real toddler 👶—guiding and supporting it as it learns 🦾, so it can master real-world skills safely and effectively!
How do we learn motor skills directly in the real world? Think about learning to ride a bike—parents might be there to give you hands-on guidance.🚲 Can we apply this same idea to robots? Introducing Robot-Trains-Robot (RTR): a new framework for real-world humanoid learning.
3
9
96
8,666
Position control can only go so far. For contact-rich tasks, robots must master both position and force – that’s where compliance comes in! But what’s the right compliance? 🤔Hint: being always compliant in all directions won’t cut it. Check out @YifanHou2’s solution 😉⤵️
Can robots learn to manipulate with both care and precision? Introducing Adaptive Compliance Policy, a framework to dynamically adjust robot compliance both spatially and temporally for given manipulation tasks from human demonstrations. Full detail at adaptive-compliance.github.i…
1
9
93
14,034
Arrived at ICRA!! Tomorrow, I'm giving talks at the ICRA HandyMoves workshop @ 8:40 am ☕️ sites.google.com/view/dexter… and at Acoustic Sensing and Representations for Robotics in the afternoon @ 4:00 pm 🍵 sites.google.com/view/roboac… Sleep early and see you soon! 🌅
1
3
95
6,488
This robot is having a lot of fun! Check out @ruoshi_liu's PaperBot, a robot that learns to design, fold, and throw a paper airplane 😊✈️, and many other things!
Humans can design tools to solve various real-world tasks, and so should embodied agents. We introduce PaperBot, a framework for learning to create and utilize paper-based tools directly in the real world. paperbot.cs.columbia.edu/
1
10
87
19,419
New group photo. Halloween Edition 👻
1
7
82
20,833
The tastiest robot demo 🤩 !!
Introducing 𝐌𝐨𝐛𝐢𝐥𝐞 𝐀𝐋𝐎𝐇𝐀🏄 -- Hardware! A low-cost, open-source, mobile manipulator. One of the most high-effort projects in my past 5yrs! Not possible without co-lead @zipengfu and @chelseabfinn. At the end, what's better than cooking yourself a meal with the 🤖🧑‍🍳
1
8
81
25,308
Let’s say we do have a powerful robot model that learns rich behaviors from large-scale robot data. What now? To make them actually useful, we need to be able to steer and control their behavior so that they match user intent, preferences, or deployment needs. How can we do that without retraining or heavy sampling? Check out @du_maximilian's DynaGuide 👇
Normally, changing robot policy behavior means changing its weights or relying on a goal-conditioned policy. What if there was another way? Check out DynaGuide, a novel policy steering approach that works on any pretrained diffusion policy. dynaguide.github.io/ 🧵
3
81
7,969
🚀 I have seen hundreds of diffusion policy roll-outs in my lab, and I have to say @YifanHou2's ACP is clearly one of the most graceful and robust policies I have seen. No training on a ton of data, just good use of compliance.
Adaptive Compliance Policy just won the best paper award at the ICRA Contact-Rich Manipulation workshop! Huge thanks to the team and everyone who supported us at the workshop. adaptive-compliance.github.i… contact-rich.github.io/
2
5
79
5,831
@chichengcc May I get a pre-order link??? 😉
After 18 months in stealth, dozens of prototypes, millions of real-home demonstrations, and one final all-nighter, we’re thrilled for you to say hello to Memo
73
14,569
Dance with the robots!🥰 not fight with them ...
It's honestly been such a dream of mine to combine my two passions: dancing and robotics.
2
6
75
13,112
Dive into AquaBot 🤖! Here are things I learned: 1⃣ Teleoperating underwater robots is really hard 😫 and oftentimes not optimal! That’s why we’re pushing AquaBot to self-improve beyond its training data. 2⃣ The underwater world is incredibly dynamic 🌊 hence, a more reactive strategy is often better. This means shorter action horizons and smaller networks -- definitely different from our typical design choices!
We present🌊AquaBot🤖: a fully autonomous underwater manipulation system powered by visuomotor policies that can continue to improve through self-learning to perform tasks including object grasping, garbage sorting, and rescue retrieval. aquabot.cs.columbia.edu more details👇
3
67
12,170
have been waiting for this release! Robotics needs rigorous and careful evaluation now more than ever 🦾
TRI's latest Large Behavior Model (LBM) paper landed on arxiv last night! Check out our project website: toyotaresearchinstitute.gith… One of our main goals for this paper was to put out a very careful and thorough study on the topic to help people understand the state of the technology, and to share a lot of details for how we're achieving it. piped.video/watch?v=BEXFnru5…
1
5
66
6,848
Thank you, Deepak! I'm honored to be in such great company :)
Replying to @pathak2206
Also congratulations to @SongShuran -- delighted to be joining the list with her. :) technologyreview.com/innovat…
1
2
66
12,108
What if your robot hand suddenly lost a finger? 🤕🤖 Wouldn’t it be great if the same policy could still be effective? Check out "GET-Zero" -- by representing the embodiment as a directed graph, a single trained policy can generalize across new designs without retraining 🪄
What if you could control new hand designs without a new policy? Introducing GET-Zero, an embodiment-aware policy that can zero-shot control a wide range of hand designs with a single set of network weights. get-zero-paper.github.io
8
67
9,097
Happy Valentines Day from Aquabot 💝
Happy Valentines Day! 🌹 Enjoy a special Valentine's day themed policy (sound on!) from the AquaBot team 👬❤️🦾 Visit aquabot.cs.columbia.edu/ to learn more about our recent ICRA publication!
2
1
56
5,371
Teleoperating a robot feels unnatural — not just because of limited arm or hand DoFs, but also because of the lack of perceptual freedom! Humans naturally move their head and torso to search, track, and focus — far beyond a simple 2-DoF camera. How to get there? Check out Vision in Action (ViA) -- it learns these active perception strategies from human demos-- and it’s simple enough to add on your robot too 🤖✨
Your bimanual manipulators might need a Robot Neck 🤖🦒 Introducing Vision in Action: Learning Active Perception from Human Demonstrations ViA learns task-specific, active perceptual strategies—such as searching, tracking, and focusing—directly from human demos, enabling robust visuomotor policies under visual occlusions. 🧵👇
1
3
65
6,107
Looking for a good action representation that is transferable across embodiments 🦾and domains 🤖? Object flow 🔀 could be a good choice! Check out @mengdaxu_1031's Im2Flow2Act to see why 😉
Can robots acquire real-world manipulation skills without real-world robot training data? Introducing Im2Flow2Act: — a framework that leverages object flow to bridge human video and simulation data, enabling scalable skill acquisition. Project Website: im-flow-act.github.io/
6
60
13,220
UMI's pretrained weights are released. We have tested the policy on three different robots: UR5, Franka, and ARX. Time to try it on your robot !! Buy any "espresso cup with saucer" on Amazon, and it should work -- or let @chichengcc know if it doesn't 😉
Weights drop ⚠️ We released our pre-trained model for the cup arrangement task trained on 1400 demos! We aim to enable anyone to deploy UMI on their robot to arrange any "espresso cup with saucer" they buy on Amazon. github.com/real-stanford/uni…
1
3
60
8,837
It will be a lot of fun working with Ruoshi!
Everyone says they want general-purpose robots. We actually mean it — and we’ll make it weird, creative, and fun along the way 😎 Recruiting PhD students to work on Computer Vision and Robotics @umdcs for Fall 2026 in the beautiful city of Washington DC!
1
55
12,309
Amazing work on collaborative cooking. The interaction between humans and robots is so natural and smooth. See the subtle things, like how the robot pauses and waits for the human to pour soup. Very impressive!
Cooking in kitchens is fun. BUT doing it collaboratively with two robots is even more satisfying! We introduce MOSAIC, a modular framework that coordinates multiple robots to closely collaborate and cook with humans via natural language interaction and a repository of skills.
1
4
54
6,700
Check out @chichengcc's step-by-step tutorial on building the UMI gripper. We really hope to see more UMIs running in the wild. 😊
We made a step-by-step video tutorial for building the UMI gripper! Please leave comments on @YouTube if you have any question piped.video/x3ko0v_xwpg
2
50
7,011
Really impressive work using UMI as the data collection interface! Excited to see the scaling law demonstrated here. 🦾
🤖 How can robot policies zero-shot generalize to any new environment and any new object? Introducing our new project: 🚀Data Scaling Laws in Imitation Learning for Robotic Manipulation🚀—bringing us closer to the dream of having robots work as waiters in hot pot restaurants! 🍲
1
6
50
5,551
One grasping policy for many (and new!) grippers. Code is available here: adagrasp.cs.columbia.edu/. Try it out, and let us know if your favorite gripper is missing! w. Zhenjia, Beichun, @submagr
3
4
48
This is so cool 🤯! Imagine pairing this robot hardware platform with generative hardware design (like the one from @XiaomengXu11 @haqhuy 👉 dgdm-robot.github.io/), we can really get customized hardware for any object or task almost instantly.
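Presenting one first-author paper at ICRA 2025. We propose a gripper whose shape and firmness can be changed relatively freely. Please take a look if you are interested. Shunya Hara, Osamu Fukuda, Mitsuru Higashimori, “Juzu Type Gripper That Can Change Both Shape and Firmness”.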
6
48
5,464
#RSS2022 is happening next week @Columbia! @chichengcc and @Zhenjia_Xu are presenting Iterative Residual Policy irp.cs.columbia.edu and DextAIRity dextairity.cs.columbia.edu Join us for a tour of our lab on Thursday! Our robots are getting dressed for demos 😜
6
43
long-awaited Toddy update!
ToddlerBot 2.0 is released🥳! Now Toddy can also do cartwheels🤸! We have added so many features since our first release in February; see github.com/hshi74/toddlerbot… for more details. Threads🧵(1/n)
1
6
46
4,979
@haqhuy’s new project: *Scaling up* robot data collection using LLMs for ✅ task decomposition ✅ reward formulation *Distill down* into visuomotor policies that ✅ operate from raw sensory input ✅ improve over time. Check out the engaging Q&A here 😉 cs.columbia.edu/~huy/scaling…
How can we put robotics on the same scaling trend as large language models while not compromising on rich low-level manipulation and control?
6
45
5,353
Also, forgot to mention, UMI is always evolving! If you're adding new sensors or making hardware tweaks, please share them as well! 🙌 Even when the data is not directly transferable to the current UMI, it can still power pretraining or other creative applications. 🚀
We recently launched umi-data.github.io as a community-driven effort to pool UMI-related data together. 🦾 If you are using a UMI-like system, please consider adding your data here. 🤩🤝 No dataset is too small; small data WILL add up!📈
1
2
43
4,355
haha 😂😂 @andyzeng_
Andy Zeng's trademark toss.
1
43
4,782
it is so cool, we need more robots and data "in the wild" 🦾!
Introducing DexWild -- a scalable approach to diverse "in the wild" data collection for dexterous robotic hands! This data can be used to co-train policy for any downstream robotic hands on any body form factor (humanoids, AMR with arms, etc). 🚀🤖
2
3
40
5,964
The visualization looks so nice and high quality!! The webpage looks like a startup 😉
Introducing Mobi-π: Mobilizing Your Robot Learning Policy. Our method: ✈️ enables flexible mobile skill chaining 🪶 without requiring additional policy training data 🏠 while scaling to unseen scenes 🧵↓
3
40
4,217
Just like us humans, failures are inevitable for robots as well and it is important to "REFLECT" on them! Check out @Liu_Zeyi_ and @ArpitBahety's new project on failure reasoning for robots. The new dataset (RoboFail) and code are out too! robot-reflect.github.io
🤖 Can robots reason about their mistakes by reflecting on past experiences? (1/n) We introduce REFLECT, a framework that leverages Large Language Models for robot failure explanation and correction, based on a summary of multi-sensory data. See below for details and links👇
2
5
38
13,780
Replying to @chichengcc
Applied and eagerly waiting! I need one in the home, one in the lab 😛 Our lab is so messy lately, need some help 😂
2
36
2,303
Thank you @CSProfKGD!!
Really cool invited talk by @SongShuran - “Making Video Model Useful for Robots”
1
30
5,769
Manipulation is not just about the hand; it is a whole-body activity 🐕
Excited to share our new work ReLIC, a framework for versatile loco-manipulation through flexible interlimb coordination. We combine reinforcement learning and model-based control to let robots dynamically assign limbs 🦾🦿 for manipulation or locomotion based on task demands.
1
34
4,309
The talk and poster session for FlingBot is tomorrow (8 am in California, 11 am in Boston, 4 pm in London, 1 am in Tokyo). Please drop by and say hi!
Congratulations to #CoRL2021 best systems paper finalist, "FlingBot: The Unreasonable Effectiveness of Dynamic Manipulation for Cloth Unfolding", Huy Ha, Shuran Song. openreview.net/forum?id=0QJe… #robotics #learning #award #research
1
30
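The four local times listed in the post above can be checked with a short conversion. This is a minimal sketch with two assumptions that are not in the post: the calendar date is a placeholder, and fixed standard-time UTC offsets are used (reasonable for a November conference, when daylight saving time is not in effect).

```python
from datetime import datetime, timedelta, timezone

# Fixed standard-time UTC offsets (assumption: no daylight saving in effect).
OFFSETS = {
    "California": timezone(timedelta(hours=-8)),
    "Boston": timezone(timedelta(hours=-5)),
    "London": timezone(timedelta(hours=0)),
    "Tokyo": timezone(timedelta(hours=9)),
}


def local_times(start):
    """Map each city to the local wall-clock time of the same instant."""
    return {city: start.astimezone(tz) for city, tz in OFFSETS.items()}


# Placeholder date; only the 8 am California start time comes from the post.
start = datetime(2021, 11, 9, 8, 0, tzinfo=OFFSETS["California"])

for city, t in local_times(start).items():
    print(f"{city}: {t:%a %H:%M}")
```

Under these assumptions the listed times agree with each other, with the Tokyo slot falling on the following calendar day.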
Can robots learn how to improve their tools (i.e., grippers) to better accomplish a given task? Check out our work “Fit2Form: 3D Generative Model for Robot Gripper Form Design.” at #CORL2020 fit2form.cs.columbia.edu w. Huy Ha, @submagr
1
9
29
dense 3D tracking for deformables 👗
Deformable objects are common in household, industrial and healthcare settings. Tracking them would unlock many applications in robotics, gen-AI, and AR. How? Check out MD-Splatting: a method for dense 3D tracking and dynamic novel view synthesis on deformable cloths. 1/6🧵
1
2
23
13,867
TRI's effort on Scaling up Diffusion Policies!
Hats off to @ToyotaResearch for exciting results with “Large Behavior Models” and Diffusion Policies: piped.video/watch?v=w-CGSQAO…
26
12,304
cross-campus, cross-embodiment generalization ✈️
Spent just over an hour collecting UMI data at Stanford, sent it to @hgupt3 at CMU to run on a drone I've never seen before, and it just works 🤯 I was expecting to iterate on the data a few more times but embodiment guidance is so robust! See umi-on-air.github.io/ to learn more!
1
3
25
5,971
robot MOO!!
🚨 🚨 Another new work showcasing bitter lesson 2.0 🚨 🚨 Introducing MOO: robot-moo.github.io We leverage vision-language models (VLMs) to allow robots to manipulate objects they've never interacted with, and in new environments, while learning end-to-end policies. 🧵
24
4,013
sweet demo 🍬🍭🥳 @andyzengtweets
From our demo floor at AI@, check out Code as Policies at work. This helper robot is able to compute and execute a task given via natural language. Read more → goo.gle/3U5CmCg
23
#CVPR2022 We are looking for volunteers from the CVPR community (graduate students, university faculty, and researchers) to help us organize *in-person* outreach events!
#CVPR2022 call for volunteers is now up! cvpr2022.thecvf.com/call-vol…


1
3
21
With the help of the teacher (UR5 arm), Toddy learns to swing and have fun!! 😇
Replying to @hkz222
RTR is not just for adaptation. We also designed a challenging "swing-up" task, training the policy entirely from scratch in the real world. The humanoid learned this motion with the help and reward from the robot arm in less than 20 minutes.
1
21
3,535
5/5 yes, the failure cases are delicious 🍑🥑🥭
2
2
17
3,755
It is a Hollywood-level demo. So cool!
Excited to share our latest progress on legged manipulation with humanoids. We created a VR interface to remote control the Draco-3 robot 🤖, which cooks ramen for hungry graduate students at night. We can't wait for the day it will help us at home in the real world! #humanoid
1
19
5,058
Apart from rope swinging, IRP is a general formulation that could work for other dynamic manipulation of deformable objects, like swinging a tablecloth. 5/n
2
16
1/5 We have developed a differentiable simulator based on Taichi for multi-material object cutting. It saved us a lot of avocados 😉
1
1
15
2,031
Code & Data for Semantic Abstraction is out! semantic-abstraction.cs.colu… We also hosted a HuggingFace demo for you to try out our multi-scale CLIP relevancy extractor: huggingface.co/spaces/huy-ha…
Semantic abstraction -- give CLIP new 3D reasoning capabilities, so your robots can find that “rapid test behind the Harry Potter book.” 😉 w. Huy Ha
4
16
4/5 It is amazing teamwork from 5(!) different universities!! Thank you all @Zhenjia_Xu, Zhou Xian, @Xingyu2017, @chichengcc, @huang_zhiao, @gan_chuang
1
1
13
3,755
also, check out @yihuai's documentation on how to run the UMI-on-Legs system on physical hardware!
Replying to @haqhuy
We spent a lot of effort on the documentation and hope that people can easily reproduce our work (including hardware!). We discussed our hardware choices and how we fixed all kinds of hardware problems so you don't have to. Please check it out! github.com/real-stanford/umi…
15
3,172
The deadline for #RSS2022 Workshops & Tutorials is approaching (Feb 18)! Remember to submit your proposal. 🤖 roboticsconference.org/infor…
2
14
Looking forward to it 🤖
TOMORROW: Spring 2022 GRASP SFI: Shuran Song (@SongShuran), Columbia University, “The Reasonable Effectiveness of Dynamic Manipulation for Deformable Objects” 3/16 @ 3:00 - 4:00pm - Levine 512 & Zoom. See you there! grasp.upenn.edu/events/sprin…
13
Cool!
RL gets specific to the robot it is trained on. Can a policy be trained to control many agents? Turns out, training a (shared) policy for each motor instead of the whole robot not only achieves SOTA at training time but also transfers to unseen agents w/o fine-tuning! huangwl18.github.io/modular-…
2
1
12
Stress test on robustness -- we interrupt the system by randomly tying a few knots on the rope after the policy converges on a given goal. Thanks to its iterative formulation, IRP can quickly adapt and regain good performance. 4/n
1
11
Replying to @ramkumarkoppu
Indeed, most biological systems don't have full-body vision (just full-body touch). But we don't need to be limited to mimicking nature! Cameras give us much richer info about the environment, and they’re also much easier to get than tactile sensors these days!
2
10
826