EasyMimic: A Low-Cost Framework for Robot Imitation Learning from Human Videos
Tao Zhang, Song Xia, Ye Wang, Qin Jin

TL;DR
EasyMimic is a low-cost, user-friendly framework that enables robots to learn manipulation tasks from human videos using RGB cameras, reducing the need for expensive data collection.
Contribution
The paper introduces EasyMimic, a novel framework that leverages human video demonstrations and simple augmentation to facilitate robot imitation learning on low-cost platforms.
Findings
Achieves high performance on manipulation tasks with minimal robot data
Effectively bridges human-robot domain gap with visual augmentation
Reduces data collection costs significantly
Abstract
Robot imitation learning is often hindered by the high cost of collecting large-scale, real-world data. This challenge is especially significant for low-cost robots designed for home use, as they must be both user-friendly and affordable. To address this, we propose the EasyMimic framework, a low-cost and replicable solution that enables robots to quickly learn manipulation policies from human video demonstrations captured with standard RGB cameras. Our method first extracts 3D hand trajectories from the videos. An action alignment module then maps these trajectories to the gripper control space of a low-cost robot. To bridge the human-to-robot domain gap, we introduce a simple and user-friendly hand visual augmentation strategy. We then use a co-training method, fine-tuning a model on both the processed human data and a small amount of robot data, enabling rapid adaptation to new…
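The action alignment step described above can be illustrated with a minimal sketch. This is not the authors' module, which this summary does not specify. It assumes a parallel-jaw gripper, thumb and index fingertip positions already recovered in the camera frame, and a known camera-to-robot extrinsic. The names `hand_to_gripper`, `T_robot_cam`, and `max_aperture_m` are hypothetical, and gripper orientation is left out for brevity.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def apply_transform(T: Sequence[Sequence[float]], p: Vec3) -> Vec3:
    """Apply a 4x4 homogeneous transform (row-major nested lists) to a 3D point."""
    x, y, z = p
    return tuple(T[i][0] * x + T[i][1] * y + T[i][2] * z + T[i][3] for i in range(3))


def hand_to_gripper(
    thumb_tip: Vec3,
    index_tip: Vec3,
    T_robot_cam: Sequence[Sequence[float]],
    max_aperture_m: float = 0.10,  # assumed widest pinch, in metres
) -> Tuple[Vec3, float]:
    """Map one frame of hand keypoints to a gripper position and opening.

    Assumption: the gripper target is the midpoint between the thumb and
    index fingertips, and the gripper opening is the pinch distance
    normalised to [0, 1].
    """
    # Midpoint of the two fingertips, still in the camera frame.
    midpoint = tuple((a + b) / 2.0 for a, b in zip(thumb_tip, index_tip))
    # Express the target in the robot base frame.
    position = apply_transform(T_robot_cam, midpoint)
    # Pinch distance, clipped to [0, 1] after normalisation.
    aperture = math.dist(thumb_tip, index_tip)
    opening = min(1.0, max(0.0, aperture / max_aperture_m))
    return position, opening


# Hypothetical extrinsic: camera frame offset by 1 m along the robot x-axis.
T_example = [
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
example_position, example_opening = hand_to_gripper(
    (0.0, 0.0, 0.5), (0.05, 0.0, 0.5), T_example
)
```

Applying this function to every frame of a hand trajectory would give a sequence of gripper targets that could be stored alongside the video frames as training labels. A real system would also need to estimate orientation and smooth the noise in the hand pose estimates.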
Taxonomy
Topics: Robot Manipulation and Learning · Human Pose and Action Recognition · Social Robot Interaction and HRI
