Jason Ma · Apr 29, 2025 · 4:29 PM UTC

Jason Ma

Pinned Tweet

Jason Ma

@JasonMa2020

29 Apr 2025

Introducing Dynamism v1 (DYNA-1) by @DynaRobotics, the first robot foundation model built for round-the-clock, high-throughput dexterous autonomy. Here is a time-lapse video of our model autonomously folding 850+ napkins in a span of 24 hours with:
• 99.4% success rate, zero human intervention
• 60% human throughput speed
• 4.3/5 quality ratings (set by the client)
A thread on our motivation, insights and results:

125

945

518,740

Jason Ma · May 3, 2024 · 4:01 PM UTC

Jason Ma

@JasonMa2020

3 May 2024

Introducing DrEureka🎓, our latest effort pushing the frontier of robot learning using LLMs! DrEureka uses LLMs to automatically design reward functions and tune physics parameters to enable sim-to-real robot learning. DrEureka can propose effective sim-to-real configurations for several robots and tasks, and we even got a bit creative with it: Let’s make a robot dog walk and balance on a yoga ball! Check out these fun videos, and follow the thread for a deep dive!

112

586

247,256

Jason Ma · Nov 7, 2024 · 8:07 PM UTC

Jason Ma

@JasonMa2020

7 Nov 2024

Excited to finally share Generative Value Learning (GVL), my @GoogleDeepMind project on extracting universal value functions from long-context VLMs via in-context learning! We discovered a simple method to generate zero-shot and few-shot values for 300+ robot tasks and 50+ datasets using SOTA VLMs like Gemini (Try out the demo on our website on your robot video today!) I worked a lot on leveraging foundation models as guidance for robots in my PhD, and to me, this result forges a new frontier in how we can use foundation models for robot learning, given its broad applicability independent of embodiment and task types. Quite excited about how we can build on this work as a community!

113

587

98,175

Jason Ma · Sep 23, 2025 · 10:47 PM UTC

Jason Ma

@JasonMa2020

23 Sep 2025

We just did the world’s first on-stage autonomous demo of a long-horizon dexterous VLA 🚨 No training. No setup. Performance out of the box. Live demos are hard and unpredictable, but we felt great about our model’s generalization, and it went pretty well! 💯 Zero-shot. 100% success.

507

63,623

Jason Ma · Sep 15, 2025 · 11:11 PM UTC

Jason Ma

@JasonMa2020

15 Sep 2025

We have raised $120M to accelerate our mission of building and delivering high-performance general-purpose robots to the physical world. Within one year, we have made research breakthroughs, showing that it is possible to achieve real-world reliability with large VLAs, and demonstrated commercial and deployment traction, with our DYNA-1 models running live at sites in SF, LA, and Sacramento. This is just the beginning, and I have never felt more optimistic about a future where AI-powered robots can positively impact human productivity. When I started my PhD, robot policies barely worked even in highly controlled settings; now we are deploying DYNA robots with confidence in out-of-the-box model performance. When I think about the trajectory of robotics, it’s astonishing how quickly we’ve gone from "if it works in the lab" to "it works in the world". The next frontier isn’t about proving robots can move; it’s about proving they can reliably help in real-world environments, at scale, across industries. That’s what we’re building at @DynaRobotics. The impact of this shift will be massive:
- Unlocking productivity across logistics, manufacturing, and beyond.
- Expanding what small teams and businesses can achieve.
- Freeing humans to focus on higher-level creativity, problem-solving, and connection.
The mission is bigger than any single deployment. It’s about ushering in an era where general-purpose robots are as ubiquitous and trusted as computers or smartphones. We’re just getting started, and I couldn’t be more excited for what’s ahead. Join us!🚀🤖

Dyna Robotics

@DynaRobotics

15 Sep 2025

Excited to announce that we have raised $120M in our Series A to advance the frontier of general-purpose high-performance robots. 🤖 The new funding will accelerate progress towards our mission of bringing foundation-model powered robots to everyone, everywhere. Read more 👇

427

68,289

Jason Ma · Nov 22, 2024 · 4:55 PM UTC

Jason Ma

@JasonMa2020

22 Nov 2024

I recently gave a talk at MIT's Embodied Intelligence seminar, covering some of my recent works and perspectives on "Foundation Model Supervision for Robot Learning". Recording is up on Youtube: piped.video/watch?v=JfZYtpEi… Hope you like it and find it useful!

345

29,632

Jason Ma · May 12, 2025 · 6:26 PM UTC

Jason Ma

@JasonMa2020

12 May 2025

Sharing an exciting DYNA-1 result: zero-shot environment generalization. We put DYNA-1 to the test in an environment completely different from our training distribution, with an entirely different background (@DynaRobotics banner) and a metal table. The table has a reflective and smooth surface, creating a wildly different visual appearance as well as different interaction dynamics. The model is able to proceed as usual, adeptly folding and recovering from its own mistakes. By focusing on task mastery, we achieve robust generalization out of the box.

303

49,104

Jason Ma · Mar 26, 2025 · 4:09 PM UTC

Jason Ma

@JasonMa2020

26 Mar 2025

Excited to launch @DynaRobotics with a team of incredible researchers, engineers and company builders! At Dyna, our mission is to bring affordable general-purpose AI robots to real production environments.

280

30,431

Jason Ma · Oct 20, 2023 · 5:19 PM UTC

Jason Ma

@JasonMa2020

20 Oct 2023

Super excited to share Eureka, our "spin" on how to use LLMs to teach low-level dexterity skills! Eureka is an open-ended reward design agent that can write and evolve superhuman reward functions for a large suite of robots and tasks, including challenging pen spinning tricks!

Jim Fan

@DrJimFan

20 Oct 2023

Can GPT-4 teach a robot hand to do pen spinning tricks better than you do? I'm excited to announce Eureka, an open-ended agent that designs reward functions for robot dexterity at super-human level. It’s like Voyager in the space of a physics simulator API! Eureka bridges the gap between high-level reasoning (coding) and low-level motor control. It is a “hybrid-gradient architecture”: a black box, inference-only LLM instructs a white box, learnable neural network. The outer loop runs GPT-4 to refine the reward function (gradient-free), while the inner loop runs reinforcement learning to train a robot controller (gradient-based). We are able to scale up Eureka thanks to IsaacGym, a GPU-accelerated physics simulator that speeds up reality by 1000x. On a benchmark suite of 29 tasks across 10 robots, Eureka rewards outperform expert human-written ones on 83% of the tasks by 52% improvement margin on average. We are surprised that Eureka is able to learn pen spinning tricks, which are very difficult even for CGI artists to animate frame by frame! Eureka also enables a new form of in-context RLHF, which is able to incorporate a human operator’s feedback in natural language to steer and align the reward functions. It can serve as a powerful co-pilot for robot engineers to design sophisticated motor behaviors. As usual, we open-source everything! Welcome you all to check out our video gallery and try the codebase today: eureka-research.github.io/ Paper: arxiv.org/abs/2310.12931 Code: github.com/eureka-research/E… Deep dive with me: 🧵

173

49,726
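The two-level loop described in the quoted Eureka tweet (a gradient-free outer loop that proposes reward functions, and an inner loop that trains and scores a controller) can be illustrated with a toy sketch. Everything here is a stand-in and not the Eureka codebase: `propose_rewards` replaces the GPT-4 call with random perturbation, `train_and_evaluate` replaces reinforcement learning with a closed-form score, and the target weights are invented for illustration.

```python
import random


def propose_rewards(best_weights, n_candidates, rng):
    """Hypothetical stand-in for the LLM: perturb the best reward weights so far."""
    return [
        [w + rng.gauss(0.0, 0.3) for w in best_weights]
        for _ in range(n_candidates)
    ]


def train_and_evaluate(weights):
    """Hypothetical stand-in for the inner RL loop.

    Returns a task-fitness score; higher is better. The "ideal" reward
    weights [1.0, -0.5] are made up for this toy example.
    """
    target = [1.0, -0.5]
    return -sum((w - t) ** 2 for w, t in zip(weights, target))


def reward_search(n_iterations=20, n_candidates=8, seed=0):
    """Outer loop: keep the best-scoring reward candidate across iterations."""
    rng = random.Random(seed)
    best = [0.0, 0.0]
    best_score = train_and_evaluate(best)
    history = [best_score]
    for _ in range(n_iterations):
        for candidate in propose_rewards(best, n_candidates, rng):
            score = train_and_evaluate(candidate)
            if score > best_score:
                best, best_score = candidate, score
        history.append(best_score)
    return best, best_score, history


if __name__ == "__main__":
    best, best_score, history = reward_search()
    print("best weights:", best)
    print("best score:", best_score)
```

The outer loop never differentiates through the scorer; it only compares scores, which is the sense in which the tweet calls it gradient-free.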

Jason Ma · Oct 4, 2022 · 10:16 PM UTC

Jason Ma

@JasonMa2020

4 Oct 2022

Excited to share VIP, a self-supervised visual reward and representation pre-trained on diverse human videos! VIP’s frozen reward and rep. can solve diverse unseen robot tasks using TrajOpt, online RL, and enables real-world few-shot offline RL! sites.google.com/view/vip-rl 🧵:

164

Jason Ma · May 30, 2023 · 5:59 PM UTC

Jason Ma

@JasonMa2020

30 May 2023

Excited to share our #ICML2023 paper ✨LIV✨! Extending VIP, LIV is at once a pre-training, fine-tuning, and (zero-shot!) multi-modal reward method for (real-world!) language-conditioned robotic control. Project: penn-pal-lab.github.io/LIV Code & Model: github.com/penn-pal-lab/LIV 🧵:

159

55,047

Jason Ma · Mar 13, 2024 · 4:03 PM UTC

Jason Ma

@JasonMa2020

13 Mar 2024

Humbled to share that I was selected as an Apple Scholar in AIML PhD Fellowship! Very grateful to Apple, my advisors @dineshjayaraman @obastani as well as all my mentors and collaborators for their support! machinelearning.apple.com/up…

154

11,271

Jason Ma · Jun 24, 2025 · 10:38 PM UTC

Jason Ma

@JasonMa2020

24 Jun 2025

Giving a talk tomorrow at the Foundation Models for Interactive Robot Learning workshop (lfmrss2025.weebly.com/) at RSS! Will cover some @DynaRobotics results too! What would people like to see?

139

7,884

Jason Ma · Oct 18, 2023 · 5:39 PM UTC

Jason Ma

@JasonMa2020

18 Oct 2023

Excited to share my first paper as an "advisor" :D We show that pre-trained visual representations enable a simple, fast, no-training subgoal decomposition method for long-horizon robotic manipulation! Paper: arxiv.org/abs/2310.08581 Website: zcczhang.github.io/UVD/ (🧵1/n)

124

18,985

Jason Ma · Nov 5, 2023 · 7:05 PM UTC

Jason Ma

@JasonMa2020

5 Nov 2023

I am attending #CORL2023 and presenting two new papers at various workshops! Excited to make new friends and catch up! Please reach out if you are attending and would like to chat about anything robot learning :)

112

31,482

Jason Ma · Nov 6, 2025 · 7:34 PM UTC

Jason Ma

@JasonMa2020

6 Nov 2025

We have been stress testing our model flywheel and seeing strong results in many challenging tasks! Task 1: chopping veggies🧑‍🍳 A test of coordinated tool use that involves an asymmetric task and dynamic feedback to make consistent cuts. We used just 70 trajectories to get the results you see in the video. Task 2: cup stacking 🎉 A test of high-precision control, which requires precise and delicate positioning at every step. Mistakes at any step are catastrophic! The arms we use have high control error, but the model makes up for it. These are quite distinct tasks that are difficult in different ways than some of our earlier demos like laundry/napkin folding. Very glad to see that by injecting dexterous control into pre-training in an integrated way, we get a substantial boost in post-training robustness and efficiency!

Dyna Robotics

@DynaRobotics

6 Nov 2025

Excited to share our latest progress on DYNA-1 pre-training! 🤖 The base model now can perform diverse, dexterous tasks (laundry folding, package sorting, …) without any post-training, even in unseen environments. This powerful base also allows extremely efficient fine-tuning to ~100% success on challenging new tasks with as little as 1 hour of data! 🤯 Watch it master two of them: cup stacking & celery chopping on repeat, no failures. 👇

130

16,995

Jason Ma · Jul 2, 2025 · 4:22 PM UTC

Jason Ma

@JasonMa2020

2 Jul 2025

It's so satisfying to watch our models just do the task on demand; human-like and no downtime

Dyna Robotics

@DynaRobotics

2 Jul 2025

We have started taking DYNA-1, our dexterous robust VLA model, to conferences and showcasing it for hours on end! The model ran for 3 days, 8 hours each day, at #HITEC2025 3 weeks ago with a 99.9% overall success rate (dropped 1 towel on day 2). No intervention, it just works :)

116

7,637

Jason Ma · Nov 5, 2024 · 4:30 PM UTC

Jason Ma

@JasonMa2020

5 Nov 2024

I am very excited about our new paper (CoRL 2024 Oral) on using LLM code generation to automate environment curricula. It is often hypothesized that intelligent motor control is driven by the need to habituate in and adapt to varied environments. While there's a beautiful literature on unsupervised env design, scaling such ideas to robotics has been very difficult so far given the high-dimensional env configuration space. Instead, robotics engineers still use their domain knowledge to manually design environments and their curricula. In Eurekaverse, we demonstrated the possibility of automating all that using LLMs and showed a compelling use case on challenging robot parkour! @willjhliang did an amazing job leading the work, and he's presenting it this week at #CoRL2024! Make sure to check out the poster and our oral presentation!

Will Liang

@willjhliang

4 Nov 2024

Introducing Eurekaverse 🌎, a path toward training robots in infinite simulated worlds! Eurekaverse is a framework for automatic environment and curriculum design using LLMs. This iterative method creates useful environments designed to progressively challenge the policy during training, enabling the learning of complex skills. We applied it to train quadruped parkour—check out these fun videos! 👟 We’ll be presenting Eurekaverse at #CoRL2024 this week and would love to see you at our oral talk and poster! And, of course, all results, details, and code are released here: Website: eureka-research.github.io/eu… Paper: eureka-research.github.io/eu… Code: github.com/eureka-research/e… Deep dive with me… 🧵

114

12,479

Jason Ma · Sep 30, 2025 · 6:45 AM UTC

Jason Ma

@JasonMa2020

30 Sep 2025

😲 Even I am sometimes surprised by what our models are capable of

Shubodh Sai 🤖@shubodhs_ai

30 Sep 2025

@DynaRobotics successfully zero-shot folded a new @corl_conf shirt -- even with the sleeve tucked in awkwardly! Amazing stuff. @JasonMa2020

110

10,169

Jason Ma · Oct 30, 2025 · 7:22 PM UTC

Jason Ma

@JasonMa2020

30 Oct 2025

We did a fun and timely Halloween experiment benchmarking our VLA models' robust reasoning capabilities! 🎃 There's a lot of interest in reasoning for VLA models, but I personally felt most tasks the community benchmarks on (1) do not require meaningful reasoning capabilities, or (2) are somewhat unrealistic and do not represent tasks in real-world scenarios. So we decided to use object counting and manipulation as a real benchmark; it's quite common and realistic, but I haven't seen much work in this area. End-to-end imitation learning would fail because of the combinatorially many permutations you can ask of the robot. Our VLA model can count and follow language commands fairly robustly -- all in an end-to-end architecture without external memory modules or counting logic. The model also robustly handles external disturbances to the scene (like shuffling the candy baskets). It's a small cute experiment we did to benchmark reasoning, but it's pretty fun so thought we'd share!

Dyna Robotics

@DynaRobotics

30 Oct 2025

🎃 Halloween is coming. Our hardworking team is lining up for sweet treats, of course, served by Dynasaur! DYNA VLA model now has robust agentic reasoning capability, allowing it to serve arbitrary combinations and counts of candies! Pure imitation learning can’t work given the combinatorially many possibilities. No video edits. Uninterrupted, real-life, as always 🤖 Happy Halloween from DYNA!🍬

105

12,205

Jason Ma · Sep 30, 2025 · 1:56 AM UTC

Jason Ma

@JasonMa2020

30 Sep 2025

the best part of this is that the model is totally operating *zero-shot*; we didn't prep anything particular for corl. we just brought a robot (a new one that we never ran models on), loaded the same model i demo'd live on stage during actuate, and let it run for the entire conf

Chris Paxton

@chris_j_paxton

29 Sep 2025

Everyone keeps talking about this demo, if anyone won corl it was these guys

14,164

Jason Ma · Oct 23, 2025 · 6:05 PM UTC

Jason Ma

@JasonMa2020

23 Oct 2025

It's really unfortunate to hear the FAIR layoff news.. Meta friends: if you are interested in working on frontier real-world robotics, @DynaRobotics is hiring! Please DM me!

12,213

Jason Ma · May 18, 2025 · 2:50 PM UTC

Jason Ma

@JasonMa2020

18 May 2025

I'm attending #ICRA2025 this week! Happy to chat about Dyna and all things embodied AI. DMs are open!

9,421

Jason Ma · Apr 29, 2025 · 5:06 PM UTC

Jason Ma

@JasonMa2020

29 Apr 2025

Replying to @JasonMa2020 @DynaRobotics

“Don’t practice until you get it right. Practice until you can’t get it wrong.” We have developed a general recipe for robust and autonomous robot foundation models for real-world applications. The linchpin in our recipe is an accurate reward model (RM) that scores every robot interaction with precision. Building on our prior research, we have delivered the first scalable foundation reward model for robotics. This model outperforms previous approaches and can reliably estimate task progress on challenging dexterity tasks, like napkin folding. This capability unlocks a host of production-critical capabilities, such as (1) autonomous exploration, (2) intentional error recovery, (3) high-quality dataset creation and curation, and much more.

22,839

Jason Ma · Apr 29, 2025 · 4:06 AM UTC

Jason Ma

@JasonMa2020

29 Apr 2025

Sharing some exciting @DynaRobotics results tomorrow! Stay tuned :D

7,307

Jason Ma · Jun 5, 2022 · 6:48 PM UTC

Jason Ma

@JasonMa2020

5 Jun 2022

Excited to start my internship at @MetaAI in their Menlo Park, CA office! I will be working with @yayitsamyzhang, @shagunsodhani, and @Vikashplus on RL and robot learning topics. If you are in the area and want to chat/hang out, please let me know!!

Jason Ma · Nov 15, 2024 · 3:40 PM UTC

Jason Ma

@JasonMa2020

15 Nov 2024

Thanks for having me! I gave a talk recently at MIT's Embodied Intelligence seminar. Recording should be up soon!

Omar Costilla Reyes, PhD @konet

14 Nov 2024

Welcome to MIT @JasonMa2020! Amazing talk and work, keep it up!

8,928

Jason Ma · Sep 4, 2025 · 5:30 PM UTC

Jason Ma

@JasonMa2020

4 Sep 2025

Thanks Chris! Dyna’s core philosophy: ship general and robust models and let everyone see. No hype, no cherry picking, just results

Chris Paxton

@chris_j_paxton

4 Sep 2025

Replying to @avizurlo

My favorite is still the Dyna Robotics 24 hours of folding video; short videos like this are too easy to cherry pick

6,735

Jason Ma · Sep 13, 2025 · 9:01 PM UTC

Jason Ma

@JasonMa2020

13 Sep 2025

Autonomous deployment in real production sites is the real litmus test of general purpose robots

8,245

Jason Ma · Apr 29, 2025 · 4:45 PM UTC

Jason Ma

@JasonMa2020

29 Apr 2025

Replying to @JasonMa2020 @DynaRobotics

When we founded Dyna, we began with a first-principles question that shaped our focus: What single technical hurdle must we clear to unlock unlimited demand for robots? After talking to hundreds of customers, the answer is so clear yet so underemphasized in current robotics discourse: PERFORMANCE, meaning high throughput and high-quality output, consistently. Delivering robot performance has been our singular north star ever since; no one wants a robot that only kind of works slowly. We have tested DYNA-1 in several distinct 24-hr trials under different natural environment variations and found that DYNA-1 robustly completes 700+ napkins in all trials. My favorite part of these timelapse videos is the 6-7am period when the sun rises and the model just continues performing the task as usual!

9,347

Jason Ma · May 6, 2025 · 7:59 PM UTC

Jason Ma

@JasonMa2020

6 May 2025

Honored to be part of the cohort!

RSS Pioneers @RSSPioneers

6 May 2025

List of 33 #RSSPioneer2025 is out! Their research interests cover fundamental robot design, modelling and control, robot perception and learning, localisation and mapping, human-robot interaction, healthcare and medical robotics, and soft robots! sites.google.com/view/rsspio…

4,722

Jason Ma · Aug 28, 2025 · 9:54 PM UTC

Jason Ma

@JasonMa2020

28 Aug 2025

Our cute Dynasaurs running around the clock; no downtime, they just work🦖🤖 Kudos to my teammates who did all the work here; it's a testament to how robust our models are. I didn't have to do any work to get this up and running for days!

Dyna Robotics

@DynaRobotics

28 Aug 2025

We brought Dynasaur to The Clean Show and ran it live, 8 hours a day. Each deployment fuels our robotics foundation model — scaling data, accelerating iteration, and enabling real-world generalization. Embodied intelligence won’t come from lab demos, but from deployment-first robotics running continuously in the wild.

7,249

Jason Ma · May 13, 2024 · 8:22 AM UTC

Jason Ma

@JasonMa2020

13 May 2024

This is so impressive! I can't imagine the amount of progress we will unlock as a community with low-cost, highly capable robots. Congrats to the Unitree Team!

Unitree

@UnitreeRobotics

13 May 2024

Unitree Introducing | Unitree G1 Humanoid Agent | AI Avatar Price from $16K 🤩 Unlock unlimited sports potential(Extra large joint movement angle, 23~34 joints) Force control of dexterous hands, manipulation of all things Imitation & reinforcement learning driven #Unitree #AI

9,513

Jason Ma · May 3, 2024 · 4:10 PM UTC

Jason Ma

@JasonMa2020

3 May 2024

Learning policies in simulation and transferring them to the real world (Sim-To-Real for short) is a promising strategy for robots to learn complex skills. However, humans need to tune the simulator carefully so that the policies work robustly in the real world: this is difficult, slow, and tedious. DrEureka aims to automate this by having LLMs design crucial components of sim-to-real transfer: Reward Design and Domain Randomization. Here are a few bonus several-minute-long uncut videos of DrEureka’s yoga ball walking policy in-the-wild on Penn’s campus, enjoy!

8,435

Jason Ma · Jun 13, 2025 · 6:22 AM UTC

Jason Ma

@JasonMa2020

13 Jun 2025

I'll be hanging out at the LeRobot Hackathon this weekend, come say hi!

Ted Xiao

@xiao_ted

12 Jun 2025

The impact of accessible low-cost robot arms and the community that’s built up around @LeRobotHF has been so awesome to see! 🤖 🚀 I am honored to be a guest judge this weekend at the Global LeRobot hackathon’s SF location. Thanks to @BitRobotNetwork for hosting.

10,056

Jason Ma · Jun 28, 2023 · 1:15 AM UTC

Jason Ma

@JasonMa2020

28 Jun 2023

Due to popular requests, we have now uploaded our pre-trained LIV model on HuggingFace for easier downloads! This is my first time doing it, and the experience was quite smooth @_akhaliq huggingface.co/jasonyma/LIV

Jason Ma

@JasonMa2020

30 May 2023

20,337

Jason Ma · Nov 15, 2025 · 7:44 AM UTC

Jason Ma

@JasonMa2020

15 Nov 2025

Using LLMs to program a robot dog to play with balls.. sounds familiar ;) jokes aside, AI assisted development of robot capabilities will be the future. We are still so early

Anthropic

@AnthropicAI

12 Nov 2025

New Anthropic research: Project Fetch. We asked two teams of Anthropic researchers to program a robot dog. Neither team had any robotics expertise—but we let only one team use Claude. How did they do?

7,532

Jason Ma · May 15, 2022 · 6:24 PM UTC

Jason Ma

@JasonMa2020

15 May 2022

Delighted to announce that our work on a unified framework for offline imitation from observations and examples has been accepted to ICML 2022! #icml #ICML2022

@_akhaliq

8 Feb 2022

SMODICE: Versatile Offline Imitation Learning via State Occupancy Matching abs: arxiv.org/abs/2202.02433 project page: sites.google.com/view/smodic… A single algorithm for offline IL from observations, mismatched experts, and examples. Sota results in all settings. @JasonMa2020

Jason Ma · Apr 29, 2025 · 5:25 PM UTC

Jason Ma

@JasonMa2020

29 Apr 2025

Replying to @JasonMa2020 @DynaRobotics

DYNA-1 now folds napkins for paying customers, and we’re unlocking more skills to ship into more commercial environments in the coming weeks & months. Mastering napkin folding won’t transform daily life, but it’s a pivotal step toward making embodied AI commercially viable. This is a dream come true. After years of PhD research aimed at making robots genuinely useful in the real world, nothing has felt as close to that goal as what we accomplished in the last few months at Dyna. As we embark on this journey, we are excited to start sharing some of our results and research more widely with the robotics community! Alongside real-world robustness, we are also pushing the boundaries of cutting-edge large-scale robot learning. Join us in building robots for the real world. We look forward to hearing from you! Check out our blog post with our results: dyna.co/research

4,206

Jason Ma · Nov 7, 2025 · 6:38 PM UTC

Jason Ma

@JasonMa2020

7 Nov 2025

Agreed; it's my first time attending a developer conference instead of an academic conference. Really got me thinking deeper about deployment-in-the-loop model iteration. Also thanks for the photo feature :D

Chris Paxton

@chris_j_paxton

7 Nov 2025

Actuate 2025 was a very cool developer conference for the new era of robotics, a short blog post: itcanthink.substack.com/p/my…

9,524

Jason Ma · Jul 25, 2023 · 6:31 PM UTC

Jason Ma

@JasonMa2020

25 Jul 2023

We are presenting LIV today at #ICML2023! Exhibit Hall 1, #827 2:00pm - 3:30pm HST The future of robotics is multi-modal, and LIV demonstrates how multi-modal value pre-training from diverse human videos can bootstrap language-conditioned robot skill learning. See you there!

Jason Ma

@JasonMa2020

30 May 2023

6,409

Jason Ma · Apr 29, 2025 · 5:08 PM UTC

Jason Ma

@JasonMa2020

29 Apr 2025

Replying to @JasonMa2020 @DynaRobotics

A unique challenge we run into at Dyna is how to make the best use of the large amount of data autonomously collected by DYNA-1 during deployment. In continuous deployment settings, robot data does not naturally come with episodic boundaries. We have also developed an approach that can automatically segment the streaming data and provide accurate progress estimation and language labeling to enhance the model's task understanding.

6,388

Jason Ma · Sep 15, 2022 · 12:35 PM UTC

Jason Ma

@JasonMa2020

15 Sep 2022

Super excited to share that GoFAR has been accepted to #NeurIPS2022 and flagged for an award! This is a new foundation for the theory and practice of (offline) goal-conditioned RL, check it out!

@_akhaliq

8 Jun 2022

How Far I'll Go: Offline Goal-Conditioned Reinforcement Learning via f-Advantage Regression abs: arxiv.org/abs/2206.03023 project page: jasonma2016.github.io/GoFAR/

Jason Ma · Apr 29, 2025 · 4:46 PM UTC

Jason Ma

@JasonMa2020

29 Apr 2025

Replying to @JasonMa2020 @DynaRobotics

In contrast, we find that standard recipes for training robot foundation models are insufficient for real-world PERFORMANCE. On a demanding production task like restaurant-grade napkin folding, state-of-the-art models saturate at ≈ 80% single-episode success even after hundreds of hours of domain-specific data. At that level, the chance of 30 flawless consecutive executions is (0.8)³⁰ ≈ 0.1%, functionally zero for 24/7 operations. We observe the same failure mode in-house: after one to two hours the policy drifts into unfamiliar states and cannot self-recover. The time-lapse below shows our strongest base model collapsing despite an initially perfect run. Flashy demos hide this brittleness; sustained autonomy demands ≥ 99% step-level reliability and robust fault-recovery, not just high single-episode accuracy. So what is our approach?

7,715
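The (0.8)³⁰ figure in the tweet above can be reproduced with a short calculation. This is a minimal sketch that assumes episodes succeed or fail independently, which is a simplification; the function names are illustrative only.

```python
def consecutive_success_probability(per_episode_success, n_episodes):
    """Probability that n independent episodes all succeed."""
    if not 0.0 <= per_episode_success <= 1.0:
        raise ValueError("per_episode_success must be in [0, 1]")
    return per_episode_success ** n_episodes


def required_per_episode_success(target_probability, n_episodes):
    """Per-episode success rate needed to hit a target streak probability."""
    if not 0.0 < target_probability <= 1.0:
        raise ValueError("target_probability must be in (0, 1]")
    return target_probability ** (1.0 / n_episodes)


if __name__ == "__main__":
    for p in (0.80, 0.99, 0.994):
        streak = consecutive_success_probability(p, 30)
        print(f"per-episode {p:.3f} -> 30 in a row: {streak:.4f}")
```

Under the same independence assumption, `required_per_episode_success` inverts the calculation, giving the single-episode rate needed for a chosen streak probability.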

Jason Ma · Sep 22, 2025 · 10:41 PM UTC

Jason Ma

@JasonMa2020

22 Sep 2025

Excited to give this talk! Also lmk if you are going to be there; DMs are open

Dyna Robotics

@DynaRobotics

22 Sep 2025

Our cofounder @JasonMa2020 speaks tomorrow at #Actuate2025: "Foundation Reward Models for Robot Learning"! If you’re around, don’t miss it! And yes, we’re HIRING across research & robotics → dyna.co/careers

7,857

Jason Ma · Dec 16, 2023 · 3:18 PM UTC

Jason Ma

@JasonMa2020

16 Dec 2023

Honored to see Eureka on this list alongside many amazing works! eureka-research.github.io/

NVIDIA AI Developer

@NVIDIAAIDev

15 Dec 2023

👀 Discover the top 10 #NVIDIAresearch projects of the year. ✨ From Neuralangelo's high-fidelity neural surface reconstruction to Magic3D's text-to-3D content creation, these projects push the boundaries of innovation in #AI. nvda.ws/3RlypJr

6,308

Jason Ma · Jul 18, 2025 · 12:14 AM UTC

Jason Ma

@JasonMa2020

18 Jul 2025

Looking forward to sharing some of my thoughts on how programmatic outputs from large foundation models can accelerate robot learning!

Shao-Hua Sun @shaohua0116

17 Jul 2025

Our #ICML2025 Programmatic Representations for Agent Learning workshop will take place tomorrow, July 18th, at the West Meeting Room 301-305, exploring how programmatic representations can make agent learning more interpretable, generalizable, efficient, and safe! Come join us!

3,091

Jason Ma · Jun 13, 2024 · 10:56 PM UTC

Jason Ma

@JasonMa2020

13 Jun 2024

We (@dineshjayaraman , @akrishna42 , and I) recently went on the Economist podcast to discuss DrEureka (eureka-research.github.io/dr…) and other recent works from our lab as well as trends on robot foundation models! Give it a listen if you are interested!

The Economist

@TheEconomist

13 Jun 2024

Why are robots suddenly getting cleverer? This week on “Babbage” @alokjha explores how advances in AI are bringing about a renaissance in robotics: econ.st/3KNARoV 🎧

5,641

Jason Ma · Apr 29, 2025 · 5:10 PM UTC

Jason Ma

@JasonMa2020

29 Apr 2025

Replying to @JasonMa2020 @DynaRobotics

By scaling our RM-in-the-loop training, DYNA-1 has leapt forward in just a few weeks:
• Week 1: Base model can complete a single success, but falls apart after 5 minutes
• Week 2: Ran 1 hour unaided, but compounding errors make recovery impossible
• Week 3: Ran 8 hours, but executed only 6-7 napkins per hour (~10 mins per fold)
• Week 4: Completed our first 24-hour run, but with only ~200 folds at low quality and speed
• Week 5: Completed 24+ hours with ~350 folds at decent production-grade quality
• Week 6: Sustained 24+ hours with ~850 folds and high production-grade quality
From stop-and-go to round-the-clock excellence, our continual learning recipe drives rapid, tangible gains.

5,565
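The weekly figures above can be converted into per-hour throughput and minutes per fold. This is a rough sketch that treats each "24+ hour" run as exactly 24 hours and takes the approximate fold counts at face value; both are assumptions.

```python
def throughput(folds, hours):
    """Return (folds per hour, minutes per fold) for a run."""
    if folds <= 0 or hours <= 0:
        raise ValueError("folds and hours must be positive")
    folds_per_hour = folds / hours
    minutes_per_fold = 60.0 / folds_per_hour
    return folds_per_hour, minutes_per_fold


# Approximate fold counts quoted in the thread, assumed to span 24 hours each.
RUNS = {"Week 4": 200, "Week 5": 350, "Week 6": 850}

if __name__ == "__main__":
    for week, folds in RUNS.items():
        per_hour, per_fold = throughput(folds, 24)
        print(f"{week}: {per_hour:.1f} folds/hour, {per_fold:.1f} min/fold")
```

As a consistency check, 6 napkins per hour works out to 10 minutes per fold, matching the Week 3 figure in the thread.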

Jason Ma · Jun 20, 2025 · 11:21 PM UTC

Jason Ma

@JasonMa2020

20 Jun 2025

I am attending #RSS this week! Participating in the Pioneers workshop as well as giving a talk at the Foundation Models for Interactive Robot Learning workshop. Happy to meet up and chat if you are around!

2,711

Jason Ma · Jun 26, 2025 · 5:57 PM UTC

Jason Ma

@JasonMa2020

26 Jun 2025

Excited to be speaking at #Actuate2025!

Foxglove

@foxglove

26 Jun 2025

Robots that actually work 24/7, no babysitting, are the holy grail of robotics. @JasonMa2020 and the team at @DynaRobotics realized the standard recipes for training robot foundation models lacked real-world 'performance' — high throughput, high quality, every time. Enter DYNA-1, the robot foundation model that autonomously folded 850+ napkins in 24 hours with a 99.4% success rate—zero human intervention at 60% human speed. How? An accurate reward model (RM) that scores every robot interaction. We are excited to have Jason Ma on stage at #Actuate2025 to talk more about the breakthroughs his team is achieving at Dyna Robotics. Tickets for Actuate are available now!: hubs.li/Q03tH3N_0

3,451

Jason Ma · Jul 21, 2023 · 8:42 PM UTC

Jason Ma

@JasonMa2020

21 Jul 2023

I am attending #ICML2023 next week in Hawaii! Excited to make new friends and re-connect with old ones! Please reach out if you are attending and would like to chat about anything related to research or ML! My particular interests include foundation models, RL, and robotics!

4,369

Jason Ma · Aug 30, 2023 · 3:41 PM UTC

Jason Ma

@JasonMa2020

30 Aug 2023

We are organizing the Workshop on Goal-Conditioned Reinforcement Learning (GCRL) at #NeurIPS 2023! Submission Deadline: October 4th, 2023 Website: goal-conditioned-rl.github.i…

7,127

Jason Ma · Mar 25, 2025 · 5:59 PM UTC

Jason Ma

@JasonMa2020

25 Mar 2025

It was super fun catching up with @micoolcho and @chris_j_paxton and talking about our Generative Value Learning (GVL) work on @RoboPapers! We are presenting this work next month at ICLR 2025 as a Spotlight paper!

RoboPapers

@RoboPapers

25 Mar 2025

Full episode dropping soon! Geeking out with @JasonMa2020 on generative-value-learning.gi… (Vision Language Models are In-Context Value Learners) Co-hosted by @chris_j_paxton & @micoolcho

3,633

Jason Ma · Feb 23, 2024 · 7:14 PM UTC

Jason Ma

@JasonMa2020

23 Feb 2024

Big congrats to my mentors and collaborators @DrJimFan and @yukez on the new group! Embodied AI and robotics research just kicked up a gear ;)

Jim Fan

@DrJimFan

23 Feb 2024

Career update: I am co-founding a new research group called "GEAR" at NVIDIA, with my long-time friend and collaborator Prof. @yukez. GEAR stands for Generalist Embodied Agent Research. We believe in a future where every machine that moves will be autonomous, and robots and simulated agents will be as ubiquitous as iPhones. We are building the Foundation Agent — a generally capable AI that learns to act skillfully in many worlds, virtual and real. 2024 is the Year of Robotics, the Year of Gaming AI, and the Year of Simulation. We are setting out on a moon-landing mission, and getting there will spin off mountains of learnings and breakthroughs. Join us on the journey: research.nvidia.com/labs/gea…

4,271

Jason Ma · May 3, 2024 · 4:14 PM UTC

Jason Ma

@JasonMa2020

3 May 2024

At a technical level, DrEureka, following our prior work Eureka (eureka-research.github.io/), uses LLM-guided evolutionary search to generate safety-aware reward functions in code that can be used to train policies in sim. Then, leveraging LLMs’ capability as hypothesis generators, DrEureka uses the LLM to choose (1) which physics parameters to randomize, and (2) what ranges they should be randomized over, based on a reward-aware physics prior (RAPP) over the domain randomization (DR) parameters. Finally, using the synthesized reward and DR parameters, it trains policies for real-world deployment.

4,920
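The DR step described in the tweet above can be illustrated with a small sketch. It is only an illustration under stated assumptions: the parameter names, the numeric ranges, the `llm_proposal` dictionary (a stand-in for a real LLM response), and the choice to clip proposals to the prior are all hypothetical and not taken from the DrEureka code.

```python
import random

# Hypothetical reward-aware physics prior (RAPP): for each parameter, a range
# over which a trained policy is assumed to still perform acceptably in sim.
# All names and numbers are illustrative, not taken from DrEureka.
rapp = {
    "friction": (0.1, 2.0),
    "restitution": (0.0, 1.0),
    "payload_mass_kg": (-1.0, 5.0),
}

# Stand-in for the LLM's answer: which parameters to randomize and over what
# range. In a real system this would come from an LLM call.
llm_proposal = {
    "friction": (0.3, 2.5),        # upper bound exceeds the prior
    "payload_mass_kg": (0.0, 2.0),
}


def clip_to_prior(proposal, prior):
    """Keep only proposed parameters known to the prior, clipped to its bounds.

    Clipping is an assumption of this sketch, not a documented DrEureka step.
    """
    clipped = {}
    for name, (lo, hi) in proposal.items():
        if name not in prior:
            continue
        p_lo, p_hi = prior[name]
        lo, hi = max(lo, p_lo), min(hi, p_hi)
        if lo <= hi:
            clipped[name] = (lo, hi)
    return clipped


def sample_physics(dr_ranges, rng):
    """Draw one set of physics parameters for a single training episode."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in dr_ranges.items()}


dr_ranges = clip_to_prior(llm_proposal, rapp)
episode_params = sample_physics(dr_ranges, random.Random(0))
```

Parameters the proposal leaves out (here `restitution`) are simply not randomized, which mirrors the "which parameters" half of the choice; the clipped ranges cover the "what ranges" half.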

Jason Ma · Jul 4, 2025 · 5:33 PM UTC

Jason Ma

@JasonMa2020

4 Jul 2025

happy 4th! it was really fun to play with this human robot interaction🤖🇺🇸

Dyna Robotics

@DynaRobotics

4 Jul 2025

Happy July 4th from Dyna! 🇺🇸🇺🇸

2,554

Jason Ma · Nov 7, 2024 · 8:08 PM UTC

Jason Ma

@JasonMa2020

7 Nov 2024

Replying to @JasonMa2020 @GoogleDeepMind

First, check out our project website for the paper, interactive demos, and a way to get your robot video labeled by GVL today! You can even listen to an AI podcast about our paper, or ask Gemini questions about our paper too! We (especially @xf1280) put a lot of effort into getting these demos up. Let us know how you find these new ways to engage with the paper! generative-value-learning.gi…

5,083

Jason Ma · Apr 29, 2025 · 5:16 PM UTC

Jason Ma

@JasonMa2020

29 Apr 2025

Replying to @JasonMa2020 @DynaRobotics

DYNA-1 achieved an unprecedented level of robustness for robot foundation models. But at Dyna, we hold ourselves to an even higher standard: production-grade quality (grade 4 or 5 on a 5-point scale). While 98% of folds reach near-perfect quality (grade ≥3), only 75% hit our rigorous quality bar. What’s the difference? Less than ⅓ inch of precision on the initial fold separates perfection (grade 5) from near-perfection (grade 3). Our customers demand perfection, not near-perfection, and we deliver. Tiny differences define commercial-grade quality at Dyna. This level of precision also raises the bar for our research, as every research idea is rigorously vetted to ensure measurable and significant real-world performance improvement.

4,459

Jason Ma · Oct 12, 2023 · 1:27 AM UTC

Jason Ma

@JasonMa2020

12 Oct 2023

Excited to share some of my recent works on pre-training for robotics with the MILA community!

REAL - Robotics and Embodied AI Lab @MontrealRobots

12 Oct 2023

Hello! We have @JasonMa2020 from UPenn giving a talk at this week's robot learning seminar (Thursday 11:30am EST online). Hope to see you all there! Title: Foundation Reward Models for General Robot Skill Acquisition piped.video/@MontrealRobotic… #Robotics #MachineLearning

3,977

Jason Ma · Apr 29, 2025 · 5:14 PM UTC

Jason Ma

@JasonMa2020

29 Apr 2025

Replying to @JasonMa2020 @DynaRobotics

Over this learning process, DYNA-1 iteratively becomes much better at handling extremely difficult and out-of-distribution situations. Napkin folding is particularly challenging because:
• Single-pull precision: Extracting exactly one napkin from a tall stack demands fine control and rapid feedback; otherwise the gripper drags out multiple napkins, causing misfolds and chaos (as you can see in the attached videos).
• Flattening: When a multi-pull leaves napkins crumpled, the policy must (1) detect that multiple sheets were removed, (2) locate corners folded inward, and (3) separate and flatten overlapped layers before refolding, all of which are nontrivial dexterous endeavors.
• Rapid self-recovery: Once in an out-of-distribution state, the robot must untangle the mess and resume folding fast enough to keep throughput intact. Every extra second spent on edge cases erodes throughput, so the policy needs to find the quickest remedy.
DYNA-1’s ability to handle chaotic scenarios surprised even us, and is the fundamental reason why it can run for 24 hours with a 99+% completion rate. There are too many robustness snippets to list, but here are a few of our favorites:

5,107

Jason Ma · Apr 29, 2025 · 5:23 PM UTC

Jason Ma

@JasonMa2020

29 Apr 2025

Replying to @JasonMa2020 @DynaRobotics

…and what about task generalization? By focusing on dexterity and real-world robustness, we’re seeing strong positive transfer to other tough commercial tasks, such as laundry folding and, at a client’s request, cup-filling. DYNA-1 can autonomously fold many shirts of different sizes and materials in a row and also fill ingredient cups with utmost precision. Cup-filling is perhaps the hardest “no-reset” task we’ve encountered: delicate pickup, precise placement, handover, tool use—one slip ends the run. Though not perfect yet, DYNA-1 can clear every step while our internal baselines fail to move beyond the first step consistently.

4,443

Jason Ma · Jun 6, 2023 · 3:06 PM UTC

Jason Ma

@JasonMa2020

6 Jun 2023

Thanks @_akhaliq! LIV is now on arXiv: arxiv.org/abs/2306.00958 Check it out if you are interested in the space of (RL-based) vision-language pre-training for robotics! Happy to answer any questions about the paper :)

LIV: Language-Image Representations and Rewards for Robotic Control

We present Language-Image Value learning (LIV), a unified objective for vision-language representation and reward learning from action-free videos with text annotations. Exploiting a novel...

arxiv.org

@_akhaliq

5 Jun 2023

LIV: Language-Image Representations and Rewards for Robotic Control paper page: huggingface.co/papers/2306.0… Language-Image Value (LIV) is a unified pre-training, fine-tuning, and reward learning algorithm for language-conditioned visual manipulation. LIV can perform zero-shot multi-modal reward prediction on unseen robot videos and is an effective vision-language encoder for real-world robotic control.

17,134

Jason Ma · Nov 7, 2023 · 7:54 PM UTC

Jason Ma

@JasonMa2020

7 Nov 2023

Happy to announce that UVD was named best paper at the CoRL LEAP Workshop!

Jason Ma

@JasonMa2020

5 Nov 2023

4,155

Jason Ma · May 3, 2024 · 4:41 PM UTC

Jason Ma

@JasonMa2020

3 May 2024

DrEureka is co-led by @willjhliang and me with collaborators from @Penn and @NVIDIA: @johnnywang_16, @sam_wang23, and our advisors @yukez @DrJimFan @obastani @dineshjayaraman Check out our project website for the paper and more videos: eureka-research.github.io/dr… Code: github.com/eureka-research/D…

3,309

Jason Ma · Feb 8, 2022 · 3:26 AM UTC

Jason Ma

@JasonMa2020

8 Feb 2022

Check out what I have been working on the past few months! SMODICE is a simple and versatile offline IL algorithm that is compatible with learning from observations, mismatched experts, and even just examples! Detailed tweet coming soon 📅

@_akhaliq

8 Feb 2022

Jason Ma · Jun 17, 2024 · 6:06 PM UTC

Jason Ma

@JasonMa2020

17 Jun 2024

Attending #CVPR2024 this week in Seattle! Looking forward to making new friends and catching up with everyone!

3,209

Jason Ma · May 22, 2023 · 3:57 PM UTC

Jason Ma

@JasonMa2020

22 May 2023

We have added example code for visualizing **animated** zero-shot VIP reward curve on robot videos! Try it out (on your own robot videos) here: github.com/facebookresearch/…

Jason Ma

@JasonMa2020

4 Oct 2022

4,778

Jason Ma · Jun 15, 2023 · 5:47 PM UTC

Jason Ma

@JasonMa2020

15 Jun 2023

Check out our #L4DC paper on learning a policy-aware dynamics model for reinforcement learning! The idea is very simple: focus model learning on the current policy’s visitation distribution. We theoretically show why this is desirable and extend the dual RL paradigm to MBRL, resulting in a simple, practical algorithm, TOM!

Kausik Sivakumar @kausiksivakumar

15 Jun 2023

Excited to share our #L4DC2023 paper that introduces "Transition Occupancy Matching" (TOM). TOM learns a dynamics model that keeps up with the improving policy, facilitating continued progress. Paper 📰: arxiv.org/abs/2305.12663 Code 💻: github.com/kausiksivakumar/T… 🧵

1,992

Jason Ma · Apr 29, 2025 · 5:20 PM UTC

Jason Ma

@JasonMa2020

29 Apr 2025

Replying to @JasonMa2020 @DynaRobotics

We found that DYNA-1 can achieve zero-shot environment generalization for long-horizon dexterity. While we have seen foundation models that can generalize to new environments for simple pick-and-place skills by training on diverse environments and objects, such results remain elusive for bi-manual fine-grained dexterity. The video below shows DYNA-1 folding napkins at a customer site with no additional training, but it’s worth noting that we did observe noticeable performance loss (a topic we will be conducting more research on). With additional on-site training, DYNA-1 quickly improves and becomes adept at continuous folding at the customer site. This milestone represents a significant step towards our vision of delivering performance, out of the box.

4,356

Jason Ma · Nov 25, 2022 · 6:02 PM UTC

Jason Ma

@JasonMa2020

25 Nov 2022

I am attending #NeurIPS2022 next week to present several works on offline RL and pre-training for robotics! Would love to meet people and discuss anything, in particular, RL and robot learning topics! DM me if you are attending and want to chat or grab coffee 😀

Jason Ma · Apr 9, 2024 · 5:28 AM UTC

Jason Ma

@JasonMa2020

9 Apr 2024

We at @GRASPlab hosted @ericjang11 last week for a thought-provoking talk on Humanoid robots and 1X! The full recording is up now, check it out!

Ted Xiao

@xiao_ted

9 Apr 2024

Nice talk on the technical approach to intelligent humanoids at 1X! grasp.upenn.edu/events/sprin… As usual for @ericjang11’s spicy takes, I agree strongly with 60%, ambivalent on 20%, and disagree with 20%. Highly recommend a watch! 💯

3,950

Jason Ma · Dec 14, 2023 · 8:23 PM UTC

Jason Ma

@JasonMa2020

14 Dec 2023

Come to our workshop on Goal-Conditioned RL tomorrow at #NeurIPS2023!

Goal-Conditioned RL Workshop @gcrl_workshop

14 Dec 2023

Check out the #NeurIPS2023 workshop on Goal Conditioned Reinforcement Learning: * Tomorrow (Friday) 9:00 -- 18:30 CST * Great speakers: @jeffclune @r_mirsky @olexandr @ybisk @SusanMurphylab1 * Program: goal-conditioned-rl.github.i…

2,645

Jason Ma · Jul 16, 2024 · 9:06 PM UTC

Jason Ma

@JasonMa2020

16 Jul 2024

I won't be at #RSS2024, but @willjhliang will be presenting our DrEureka work! Go talk to him to get the latest scoop on using foundation models for dexterous robot skill learning!

Will Liang

@willjhliang

16 Jul 2024

I'll be presenting DrEureka at @RoboticsSciSys on Thursday (July 18), 8:30-9:30 am! Happy to chat at the poster session right afterwards (Commissiekamer 2) or any time during the conference! Looking forward to making new friends! #RSS2024 eureka-research.github.io/dr…

2,539

Jason Ma · May 3, 2024 · 4:37 PM UTC

Jason Ma

@JasonMa2020

3 May 2024

This was a very fun project and we learned a lot about how to use LLMs to enable robot skill learning! There are many challenges and potential future directions. For example, how to combine DrEureka with real-world execution feedback, and how to use vision to provide feedback on reward and DR generation. And of course, not having to use a leash in the real world would be nice, but safety was important (no robot was hurt in the demos!). To cap off, here are some failures and blooper videos :)

3,531

Jason Ma · Dec 8, 2023 · 6:28 PM UTC

Jason Ma

@JasonMa2020

8 Dec 2023

Jim has been a fantastic mentor for me! Apply if you are interested in foundation models for decision making and AI Agents!

9,519

Jason Ma · Dec 9, 2022 · 4:38 AM UTC

Jason Ma

@JasonMa2020

9 Dec 2022

Reward learning is a fundamental challenge in RL. In VIP, we address this by pre-training a value function on action-free human videos, and the pre-trained VIP value function can zero-shot transfer to unseen robot tasks! Find out more about VIP at the Deep RL workshop tomorrow!

Karol Hausman

@hausman_k

7 Dec 2022

Deep RL workshop @NeurIPSConf starts on Friday! Hear invited talks from @tobigerstenberg on counterfactual simulation of causal judgments, @j_foerst on opponent-shaping in games, @IMordatch on sequence modeling, @yayitsamyzhang on learning generalist agents. Don't miss out! 🧵

Jason Ma · May 18, 2024 · 11:37 PM UTC

Jason Ma

@JasonMa2020

18 May 2024

sim2real for manipulation tasks is really hard! Great to see this work come out

Yunfan Jiang

@YunfanJiang

17 May 2024

Does your sim2real robot falter at critical moments 🤯? Want to help but unsure how, all you can do is reward tuning in sim 😮‍💨? Introduce 𝐓𝐑𝐀𝐍𝐒𝐈𝐂 for manipulation sim2real. Robots learned in sim can accomplish complex tasks in real, such as furniture assembly. 🤿🧵

5,416

Jason Ma · Mar 26, 2025 · 4:09 PM UTC

Jason Ma

@JasonMa2020

26 Mar 2025

The robotics industry has received much excitement lately. However, current technical challenges significantly hinder the widespread viability of robots. Most robotic foundation models today operate at just 10-30% of human-level speeds, impacting productivity on high-speed production lines. They also lack generalization capabilities, requiring extensive data collection even for minor environmental or task variations. Additionally, robot hardware remains expensive and insufficiently durable, limiting deployments to scenarios where ROI is easily justified... And these are just a few of the issues.

1,632

Jason Ma · Sep 15, 2025 · 11:23 PM UTC

Jason Ma

@JasonMa2020

15 Sep 2025

We are hiring, and my DMs are open! dyna.co/careers

1,270

Jason Ma · Nov 7, 2024 · 8:24 PM UTC

Jason Ma

@JasonMa2020

7 Nov 2024

Replying to @JasonMa2020 @GoogleDeepMind @physical_int @tonyzzhao

I'd like to thank all my collaborators for making this a super fun and rewarding project: @JoeyHejna @ayzwah @ChuyuanFu @shahdhruv_ @jackyliang42 @drzhuoxu @SeanKirmani @sippeyxp @DannyDriess @xiao_ted @JonathanTompson @obastani @dineshjayaraman @Stacormed @tingnan1986 @DorsaSadigh @xf1280. Many of them are currently at @corl_conf, so make sure to talk to them about our paper! I am particularly grateful to @xf1280 for his mentorship and guidance throughout this project; I benefited a lot from his expertise and insights on frontier VLMs for robotics!

1,671

Jason Ma · Nov 22, 2024 · 5:00 PM UTC

Jason Ma

@JasonMa2020

22 Nov 2024

Papers covered: Generative Value Learning (GVL): arxiv.org/abs/2411.04549 Language-Image Value Learning (LIV): arxiv.org/abs/2306.00958 Value Implicit-Pretraining (VIP): arxiv.org/abs/2210.00030 Eurekaverse: arxiv.org/abs/2411.01775 DrEureka: arxiv.org/abs/2406.01967 Eureka: arxiv.org/abs/2310.12931 Universal Visual Decomposer (UVD): arxiv.org/abs/2310.08581 Goal-Contrastive Rewards (GCR): arxiv.org/abs/2410.19989

1,657

Jason Ma · Mar 26, 2025 · 4:09 PM UTC

Jason Ma

@JasonMa2020

26 Mar 2025

If our mission excites you, please reach out! We are looking for incredible talents in both AI and robotics! DM me or visit our site for open roles! dyna.co

1,056

Jason Ma · Mar 23, 2024 · 4:39 PM UTC

Jason Ma

@JasonMa2020

23 Mar 2024

Excited to have @ericjang11 visit @GRASPlab @PennEngineers!!

Eric Jang

@ericjang11

23 Mar 2024

I'll be giving a talk at @GRASPlab on Wednesday, 3/27 on the robot learning we're doing at @1x_tech. If you're a researcher at Penn working on similar things I'd love to visit your labs and see what you're working on as well! Please DM grasp.upenn.edu/events/sprin…

3,699

Jason Ma · Dec 2, 2022 · 5:50 AM UTC

Jason Ma

@JasonMa2020

2 Dec 2022

Giving contributed talks on VIP at the Offline RL, Foundation Model for Decision Making, and Deep RL workshops at #NeurIPS2022. Come check out how we can pre-train a value function on passive human data and zero-shot transfer to robotics manipulation!

Vikash Kumar

@Vikashplus

29 Nov 2022

Replying to @Vikashplus

#VIP: Self-supervised pre-trained visual reward and representation for robotics 🎯 DeepRL, OfflineRL, SSL workshops 🔗sites.google.com/view/vip-rl @shagunsodhani @dineshjayaraman @obastani @Vikashplus @yayitsamyzhang @JasonMa2020 nitter.app/JasonMa2020/status/157… ⏯️⏯️

Jason Ma · May 3, 2024 · 4:33 PM UTC

Jason Ma

@JasonMa2020

3 May 2024

We highlight quadruped yoga ball walking. This task is particularly hard because (1) it is a novel task for which the LLM could not have seen human-generated reward functions or DR, and (2) simulation cannot model the deformable surface of an air-inflated ball, making good sim-to-real absolutely necessary. It was not clear at all when we started that this could work, so we are very excited about the result! Here is a comparison against other controllers and we clearly see that alternatives quickly fail and are quite unsafe to deploy.

3,774

Jason Ma · Oct 19, 2023 · 10:46 PM UTC

Jason Ma

@JasonMa2020

19 Oct 2023

UVD code is open-source now: github.com/zcczhang/UVD/ Get your long-horizon demonstrations segmented in a few seconds! We support various SOTA pre-trained reps (VIP, R3M, LIV, VC-1, ...) as well as policy backbones (MLP, GPT).

GitHub - zcczhang/UVD: Universal Visual Decomposer: Long-Horizon Manipulation Made Easy

Universal Visual Decomposer: Long-Horizon Manipulation Made Easy - zcczhang/UVD

github.com

Jason Ma

@JasonMa2020

18 Oct 2023

3,487

Jason Ma · Jul 14, 2023 · 3:00 PM UTC

Jason Ma

@JasonMa2020

14 Jul 2023

Replying to @HaqueIshfaq

We are organizing a workshop on goal-conditioned RL! Please do consider submitting/participating if relevant :) more details to follow.

890

Jason Ma · Mar 26, 2025 · 4:09 PM UTC

Jason Ma

@JasonMa2020

26 Mar 2025

We believe mastery of one beats mediocrity in many. We’re laser-focused on achieving general-purpose capabilities by perfecting one task at a time. Today, we’re bringing cost-effective, easy-to-deploy robotics solutions to businesses of all sizes.

3,198

Jason Ma · Mar 26, 2025 · 4:09 PM UTC

Jason Ma

@JasonMa2020

26 Mar 2025

How do you know you have product-market fit? When your robot looks like crap but delivers so much value that customers happily pay for it.

1,075

Jason Ma · Mar 26, 2025 · 4:09 PM UTC

Jason Ma

@JasonMa2020

26 Mar 2025

Also check out the @FortuneMagazine coverage from @sharongoldman; thanks Sharon for breaking the news: fortune.com/2025/03/25/exclu…

From smart carts to toilet-scrubbing robots: This founder sold to Instacart for $350M—now he’s back...

Startup founder has a new startup building low-cost robots to do the chores, from folding laundry to cleaning toilets.

fortune.com

1,326

Jason Ma · Oct 20, 2023 · 5:38 PM UTC

Jason Ma

@JasonMa2020

20 Oct 2023

This is my internship project at @NVIDIAAI! I had a blast working on it and learned a lot from the experience. I am really grateful to my mentors @AnimaAnandkumar @DrJimFan @yukez for their guidance on the project!

1,039

Jason Ma · Jan 3, 2024 · 5:57 PM UTC

Jason Ma

@JasonMa2020

3 Jan 2024

Replying to @tonyzzhao @zipengfu @chelseabfinn

Congrats Tony!! Very impressive :)

7,478

Jason Ma · Nov 7, 2024 · 8:16 PM UTC

Jason Ma

@JasonMa2020

7 Nov 2024

Replying to @JasonMa2020 @GoogleDeepMind @xf1280

The answer is yes, and the method is simple yet intriguing! We propose formulating value learning as an autoregressive prediction task over a *shuffled* sequence of the input video frames. Why? Think about a standard video showing a task unfolding in chronological order. We empirically find that this actually makes it harder for the VLM to estimate progress because it might just latch onto the order of the frames instead of the underlying changes that signify actual progress towards completing the task. By shuffling, we force the VLM to work “harder” to figure out the correct order based on the visual cues of task progress, and doing so significantly improves the faithfulness of the value predictions! In a way, GVL poses value prediction as a “temporal unshuffling” puzzle to the VLM; it has all the pieces, but it has to figure out how those pieces fit together in a way that makes sense based on progress towards a goal.

2,963
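The shuffling idea in the tweet above can be sketched in a few lines. This is a minimal illustration, not the GVL implementation: `predict_values` is a hypothetical stand-in for the VLM call, and the toy "frames" are plain numbers rather than images.

```python
import random


def shuffled_value_estimates(frames, predict_values, rng):
    """Query a value predictor on shuffled frames, then restore time order.

    `predict_values` is a hypothetical stand-in for the VLM call: it takes a
    list of frames and returns one progress value per frame, in the same
    order it received them.
    """
    order = list(range(len(frames)))
    rng.shuffle(order)  # order[k] is the original index of the k-th shown frame
    shuffled = [frames[i] for i in order]
    shuffled_values = predict_values(shuffled)

    values = [None] * len(frames)
    for k, original_index in enumerate(order):
        values[original_index] = shuffled_values[k]  # undo the permutation
    return values


# Toy example: each "frame" is a count of completed steps (0 to 4), and the
# toy predictor reads progress directly off the frame content.
def toy_predictor(frames):
    return [f / 4 for f in frames]


frames = [0, 1, 2, 3, 4]
values = shuffled_value_estimates(frames, toy_predictor, random.Random(0))
print(values)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```

Because the predictor never sees the frames in chronological order, any value it assigns has to come from frame content; the inverse permutation then maps the predictions back onto the original timeline.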

Jason Ma · Mar 26, 2025 · 4:09 PM UTC

Jason Ma

@JasonMa2020

26 Mar 2025

After conversations with hundreds of potential customers, we discovered that people don’t care about fancy designs—they just want robots to perform reliably and deliver clear ROI. Our approach? Start simple and excel.

1,230

Jason Ma · Jun 11, 2025 · 5:56 PM UTC

Jason Ma

@JasonMa2020

11 Jun 2025

Nice work on how to go beyond BC by learning to search using world and reward models! The nice thing is that you don't need to collect any additional data.

Gokul Swamy @g_k_swamy

10 Jun 2025

Say ahoy to 𝚂𝙰𝙸𝙻𝙾𝚁⛵: a new paradigm of *learning to search* from demonstrations, enabling test-time reasoning about how to recover from mistakes w/o any additional human feedback! 𝚂𝙰𝙸𝙻𝙾𝚁 ⛵ out-performs Diffusion Policies trained via behavioral cloning on 5-10x data!

2,845

Jason Ma · Oct 30, 2025 · 7:27 PM UTC

Jason Ma

@JasonMa2020

30 Oct 2025

Another video of our VLA's robust reasoning capabilities against real-time disturbances:

937

Jason Ma · Jun 12, 2024 · 7:04 PM UTC

Jason Ma

@JasonMa2020

12 Jun 2024

We are organizing a workshop on task specification at #RSS2024! Consider submitting your latest work to our workshop and attending!

Jason Liu @HRI @jasonxyliu

5 Jun 2024

Submit to our #RSS2024 workshop on “Robotic Tasks and How to Specify Them? Task Specification for General-Purpose Intelligent Robots” by June 12th. Join our discussion on what constitutes various task specifications for robots, in what scenarios they are most effective and more!

3,286