TL;DR
This survey reviews recent advances in robot learning from human videos, highlighting techniques, datasets, and future challenges to enable scalable robot skill acquisition from passive human demonstrations.
Contribution
It provides a comprehensive taxonomy of methods for learning robot skills from human videos, a review of the datasets used, and an analysis of the field that emphasizes future research directions.
Findings
Hierarchical taxonomy of human-video-to-robot-skill transfer pathways
Analysis of widely-used human video datasets and generation schemes
Identification of intrinsic challenges and limitations in the field
Abstract
A critical bottleneck hindering further advancement in embodied AI and robotics is the challenge of scaling robot data. To address this, the field of learning robot manipulation skills from human video data has attracted rapidly growing attention in recent years, driven by the abundance of human activity videos and advances in computer vision. This line of research promises to enable robots to acquire skills passively from the vast and readily available resource of human demonstrations, substantially favoring scalable learning for generalist robotic systems. Therefore, we present this survey to provide a comprehensive and up-to-date review of human-video-based learning techniques in robotics, focusing on both human-robot skill transfer and data foundations. We first review the policy learning foundations in robotics, and then describe the fundamental interfaces to incorporate human…
