HoMMI: Learning Whole-Body Mobile Manipulation from Human Demonstrations

Xiaomeng Xu; Jisang Park; Han Zhang; Eric Cousineau; Aditya Bhat; Jose Barreiros; Dian Wang; Jeannette Bohg; Shuran Song

arXiv:2603.03243 · cs.RO · May 18, 2026

TL;DR

HoMMI is a scalable framework for learning whole-body mobile manipulation directly from robot-free human demonstrations. It adds egocentric sensing to UMI interfaces and uses a cross-embodiment hand-eye policy design to transfer the learned policies to a robot.

Contribution

The paper presents a data collection and policy learning framework that bridges the human-to-robot embodiment gap, in both observation and action spaces, when learning mobile manipulation from robot-free human demonstrations.

Findings

01. Enables long-horizon bimanual and whole-body manipulation tasks.

02. Uses egocentric sensing for global context in data collection.

03. Achieves effective policy transfer through cross-embodiment design.

Abstract

We present the Whole-Body Mobile Manipulation Interface (HoMMI), a data collection and policy learning framework that learns whole-body mobile manipulation directly from robot-free human demonstrations. We augment UMI interfaces with egocentric sensing to capture the global context required for mobile manipulation, enabling portable, robot-free, and scalable data collection. However, naively incorporating egocentric sensing introduces a larger human-to-robot embodiment gap in both observation and action spaces, making policy transfer difficult. We explicitly bridge this gap with a cross-embodiment hand-eye policy design, including an embodiment-agnostic visual representation; a relaxed head action representation; and a whole-body controller that realizes hand-eye trajectories through coordinated whole-body motion under robot-specific physical constraints. Together, these enable long-horizon…

Taxonomy

Topics: Robot Manipulation and Learning · Social Robot Interaction and HRI · Motor Control and Adaptation