BifrostUMI: Bridging Robot-Free Demonstrations and Humanoid Whole-Body Manipulation
Chenhao Yu, Hongwu Wang, Youhao Hu, Jiachen Zhang, Yuanyuan Li, Shaqi Luo

TL;DR
BifrostUMI is a portable, robot-free data collection framework for humanoid robots. It uses lightweight VR devices to capture human demonstrations and transfers them to robot behaviors without requiring robot teleoperation.
Contribution
The paper presents a multimodal data collection method for humanoid manipulation that combines VR-captured sparse keypoint trajectories, wrist-mounted visual data, and keypoint retargeting, replacing robot teleoperation as the source of training data.
Findings
Effective transfer of human demonstrations to humanoid robots.
Versatile application across different experimental scenarios.
High-quality data enables improved visuomotor policy training.
Abstract
High-quality data collection is a fundamental cornerstone for training humanoid whole-body visuomotor policies. Current data acquisition paradigms predominantly rely on robot teleoperation, which is often hindered by limited hardware accessibility and low operational efficiency. Inspired by the Universal Manipulation Interface (UMI), we propose BifrostUMI, a portable, efficient, and robot-free data collection framework tailored for humanoid robots. BifrostUMI leverages lightweight VR devices to capture human demonstrations as sparse keypoint trajectories while simultaneously recording wrist-mounted visual data. These multimodal data are subsequently utilized to train a high-level policy network that predicts future keypoint trajectories conditioned on the captured visual features. Through a robust keypoint retargeting pipeline, keypoint trajectories are precisely mapped onto the robot's…
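To make the data flow in the abstract concrete, the following is a minimal sketch of how one recorded demonstration might be sliced into training pairs, where each pair couples a wrist-camera observation with the keypoint trajectory that follows it. It is an illustration under stated assumptions and not the authors' implementation: the names `TrainingPair`, `make_training_pairs`, and `horizon`, the use of string frame identifiers in place of images, and the 3D-tuple keypoint representation are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Hypothetical representation: one keypoint is an (x, y, z) position.
Point3D = Tuple[float, float, float]


@dataclass
class TrainingPair:
    """One supervised sample: an observation and the keypoints that follow it."""
    frame_id: str                           # stand-in for a wrist-camera image
    future_keypoints: List[List[Point3D]]   # next `horizon` keypoint sets


def make_training_pairs(
    frame_ids: Sequence[str],
    keypoints: Sequence[Sequence[Point3D]],
    horizon: int,
) -> List[TrainingPair]:
    """Slice a time-aligned demonstration into (observation, future trajectory) pairs.

    Assumes frame_ids[t] and keypoints[t] were recorded at the same timestep.
    """
    if len(frame_ids) != len(keypoints):
        raise ValueError("frames and keypoints must be time-aligned")
    if horizon < 1:
        raise ValueError("horizon must be at least 1")

    pairs = []
    # Stop early enough that every sample has a full future window.
    for t in range(len(frame_ids) - horizon):
        future = [list(k) for k in keypoints[t + 1 : t + 1 + horizon]]
        pairs.append(TrainingPair(frame_id=frame_ids[t], future_keypoints=future))
    return pairs
```

In this sketch a demonstration of T timesteps yields T - horizon samples. A policy network would then be trained to predict `future_keypoints` from the image that `frame_id` stands in for; the retargeting step that maps predicted keypoints onto the robot is not modeled here.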
