Emergent Dexterity via Diverse Resets and Large-Scale Reinforcement Learning

Patrick Yin; Tyler Westenbroek; Zhengyu Zhang; Joshua Tran; Ignacio Dagnino; Eeshani Shilamkar; Numfor Mbiziwo-Tiapo; Simran Bagaria; Xinlei Liu; Galen Mullins; Andrey Kolobov; Abhishek Gupta

arXiv:2603.15789 · cs.RO · April 3, 2026

TL;DR

OmniReset is a scalable reinforcement learning framework that uses systematic simulator resets to enable robust, long-horizon dexterous manipulation policies without task-specific engineering or demonstrations.

Contribution

The paper introduces OmniReset, a reset-based approach that simplifies long-horizon exploration and scales to complex manipulation tasks with a single reward function, fixed algorithm hyperparameters, no curricula, and no human demonstrations.

Findings

01 OmniReset outperforms existing methods on long-horizon manipulation tasks.

02 Policies trained with OmniReset transfer zero-shot to the real world.

03 OmniReset achieves broader behavioral coverage while requiring only minimal human-designed resets.

Abstract

Reinforcement learning in massively parallel physics simulations has driven major progress in sim-to-real robot learning. However, current approaches remain brittle and task-specific, relying on extensive per-task engineering to design rewards, curricula, and demonstrations. Even with this engineering, they often fail on long-horizon, contact-rich manipulation tasks and do not meaningfully scale with compute, as performance quickly saturates when training revisits the same narrow regions of state space. We introduce OmniReset, a simple and scalable framework that enables on-policy reinforcement learning to robustly solve a broad class of dexterous manipulation tasks using a single reward function, fixed algorithm hyperparameters, no curricula, and no human demonstrations. Our key insight is that long-horizon exploration can be dramatically simplified by using simulator resets to…
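The abstract is cut off before it says how resets are chosen, so the sketch below is only a generic illustration of reset-state diversity, not OmniReset's actual procedure. It assumes a hypothetical precomputed pool of reset states (`reset_pool`) and a hypothetical helper (`sample_reset_states`) that draws one starting state per parallel environment, rather than starting every episode from the same initial state.

```python
import random


def sample_reset_states(reset_pool, num_envs, rng):
    """Draw one reset state per parallel environment, uniformly with replacement.

    Hypothetical helper for illustration; not taken from the paper or its code.
    """
    return [rng.choice(reset_pool) for _ in range(num_envs)]


# Hypothetical pool: positions along a toy 1-D task, from the start (0.0)
# to close to the goal (0.9). A real pool would hold full simulator states.
reset_pool = [i / 10 for i in range(10)]

rng = random.Random(0)  # seeded so repeated runs draw the same states
starts = sample_reset_states(reset_pool, num_envs=8, rng=rng)
```

Because episodes begin from states spread across the task instead of one fixed start, later stages of a long-horizon task are visited without first having to solve the earlier ones; how the paper builds and weights its own reset distribution is not stated on this page.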

Code & Models

Repositories

https://weirdlabuw.github.io/omnireset · github

Videos

Emergent Dexterity Via Diverse Resets and Large-Scale Reinforcement Learning · slideslive