EgoHumanoid: Unlocking In-the-Wild Loco-Manipulation with Robot-Free Egocentric Demonstration
Modi Shi, Shijia Peng, Jin Chen, Haoran Jiang, Yinghui Li, Di Huang, Ping Luo, Hongyang Li, Li Chen

TL;DR
EgoHumanoid leverages abundant human egocentric demonstrations combined with limited robot data to enable humanoids to perform loco-manipulation in diverse real-world environments, addressing the embodiment gap through a systematic alignment pipeline.
Contribution
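This work introduces a framework that co-trains a vision-language-action policy on abundant egocentric human demonstrations and a limited amount of robot data, together with a systematic alignment pipeline that bridges the embodiment gap between humans and humanoids.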
Findings
Incorporating human data improves performance by 51% over robot-only baselines.
The alignment pipeline effectively bridges the embodiment gap between humans and robots.
Behaviors learned from human demonstrations transfer efficiently to humanoids in unseen environments.
Abstract
Human demonstrations offer rich environmental diversity and scale naturally, making them an appealing alternative to robot teleoperation. While this paradigm has advanced robot-arm manipulation, its potential for the more challenging, data-hungry problem of humanoid loco-manipulation remains largely unexplored. We present EgoHumanoid, the first framework to co-train a vision-language-action policy using abundant egocentric human demonstrations together with a limited amount of robot data, enabling humanoids to perform loco-manipulation across diverse real-world environments. To bridge the embodiment gap between humans and robots, including discrepancies in physical morphology and viewpoint, we introduce a systematic alignment pipeline spanning from hardware design to data processing. A portable system for scalable human data collection is developed, and we establish practical collection…
Taxonomy
Topics: Robot Manipulation and Learning · Social Robot Interaction and HRI · Human Motion and Animation
