ViViDex: Learning Vision-based Dexterous Manipulation from Human Videos
Zerui Chen, Shizhe Chen, Etienne Arlaud, Ivan Laptev, Cordelia Schmid

TL;DR
This paper introduces ViViDex, a framework that learns vision-based dexterous manipulation policies from human videos. It addresses two limitations of prior work, noise in the estimated trajectories and reliance on privileged object information, and reports stronger performance than prior methods both in simulation and on a real robot.
Contribution
ViViDex combines reinforcement learning with trajectory-guided rewards and a coordinate transformation to train unified visual policies from human videos without privileged information.
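As a rough illustration of what a trajectory-guided reward can look like, the sketch below scores a simulation step by how closely the robot fingertips and the object follow a reference frame taken from a video. The exponential tracking form, the weights, the scale, and every function and parameter name here are assumptions made for illustration; the summary above does not specify the reward actually used in ViViDex.

```python
import math


def tracking_reward(current, reference, scale=10.0):
    """Return exp(-scale * mean distance) between matched 3D points.

    The result is 1.0 for a perfect match and decays towards 0.0 as the
    points drift apart. `scale` is a hypothetical sharpness parameter.
    """
    if not current or len(current) != len(reference):
        raise ValueError("current and reference must be non-empty and equal in length")
    distances = [math.dist(c, r) for c, r in zip(current, reference)]
    return math.exp(-scale * sum(distances) / len(distances))


def trajectory_guided_reward(fingertips, object_pos, ref_fingertips, ref_object_pos,
                             w_hand=0.5, w_object=0.5):
    """Weighted sum of hand-tracking and object-tracking terms (assumed form)."""
    hand_term = tracking_reward(fingertips, ref_fingertips)
    object_term = tracking_reward([object_pos], [ref_object_pos])
    return w_hand * hand_term + w_object * object_term


if __name__ == "__main__":
    # One fingertip and one object position, both 2 cm away from the reference.
    reward = trajectory_guided_reward(
        fingertips=[(0.02, 0.0, 0.0)],
        object_pos=(0.0, 0.02, 0.0),
        ref_fingertips=[(0.0, 0.0, 0.0)],
        ref_object_pos=(0.0, 0.0, 0.0),
    )
    print(f"reward = {reward:.3f}")
```

A dense term of this kind gives the reinforcement learning agent a signal at every step of the reference trajectory, which is one common way to keep the learned motion close to the human demonstration while the simulator enforces physical plausibility.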
Findings
Outperforms state-of-the-art methods in three manipulation tasks
Effective in both simulation and real robot experiments
Improves visual policy learning from noisy human videos
Abstract
In this work, we aim to learn a unified vision-based policy for multi-fingered robot hands to manipulate a variety of objects in diverse poses. Though prior work has shown benefits of using human videos for policy learning, performance gains have been limited by the noise in estimated trajectories. Moreover, reliance on privileged object information such as ground-truth object states further limits the applicability in realistic scenarios. To address these limitations, we propose a new framework, ViViDex, to improve vision-based policy learning from human videos. It first uses reinforcement learning with trajectory-guided rewards to train state-based policies for each video, obtaining both visually natural and physically plausible trajectories from the video. We then roll out successful episodes from the state-based policies and train a unified visual policy without using any privileged…
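The second stage described in the abstract, rolling out the state-based policies and keeping only successful episodes as training data for the visual policy, can be sketched as a simple filtering step. The `Step` and `Episode` containers, their field names, and `build_visual_dataset` are hypothetical and exist only for this illustration; how the visual policy itself is trained is not covered here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Step:
    visual_obs: Tuple[float, ...]    # placeholder for visual input features
    object_state: Tuple[float, ...]  # privileged state, seen only by the state-based policy
    action: Tuple[float, ...]        # action chosen by the state-based policy


@dataclass
class Episode:
    steps: List[Step]
    success: bool


def build_visual_dataset(episodes):
    """Collect (visual_obs, action) pairs from successful episodes only.

    The privileged `object_state` field is deliberately dropped, so a policy
    trained on the result never sees ground-truth object information.
    """
    dataset = []
    for episode in episodes:
        if not episode.success:
            continue  # failed rollouts are discarded
        for step in episode.steps:
            dataset.append((step.visual_obs, step.action))
    return dataset


if __name__ == "__main__":
    good = Episode([Step((0.1, 0.2), (1.0, 0.0, 0.0), (0.5,))], success=True)
    bad = Episode([Step((0.3, 0.4), (0.0, 1.0, 0.0), (-0.5,))], success=False)
    print(build_visual_dataset([good, bad]))
```

Filtering on success means the visual policy imitates only the rollouts that the physics simulator has validated, rather than the noisy trajectories estimated directly from the videos.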
Taxonomy
Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Reinforcement Learning in Robotics
Methods: Diffusion
