TL;DR
RoSHI is a hybrid wearable that fuses low-cost sparse IMUs with Project Aria glasses to estimate the wearer's full 3D pose and body shape in a metric global coordinate frame, supporting in-the-wild human data collection for robot learning.
Contribution
The paper introduces RoSHI, a novel hybrid wearable system that fuses IMUs with egocentric perception to improve 3D human pose estimation in natural environments.
Findings
On the authors' dataset of agile activities, RoSHI generally outperforms other egocentric baselines in pose estimation.
RoSHI performs comparably to a state-of-the-art exocentric baseline (SAM3D).
Recorded motion data is effective for humanoid policy learning.
Abstract
Scaling up robot learning will likely require human data containing rich and long-horizon interactions in the wild. Existing approaches for collecting such data trade off portability, robustness to occlusion, and global consistency. We introduce RoSHI, a hybrid wearable that fuses low-cost sparse IMUs with the Project Aria glasses to estimate the full 3D pose and body shape of the wearer in a metric global coordinate frame from egocentric perception. This system is motivated by the complementarity of the two sensors: IMUs provide robustness to occlusions and high-speed motions, while egocentric SLAM anchors long-horizon motion and stabilizes upper body pose. We collect a dataset of agile activities to evaluate RoSHI. On this dataset, we generally outperform other egocentric baselines and perform comparably to a state-of-the-art exocentric baseline (SAM3D). Finally, we demonstrate that…
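The sensor complementarity described in the abstract can be illustrated with a toy complementary filter. This is a minimal sketch under stated assumptions, not RoSHI's fusion method, which the abstract does not specify. The function name `fuse_positions`, the blend weight `alpha`, and the 1D setup are all hypothetical: IMU-derived displacements carry fast motion but drift over time, while SLAM position fixes pull the estimate back toward a globally consistent value.

```python
def fuse_positions(imu_deltas, slam_positions, alpha=0.9):
    """Toy 1D complementary filter (illustrative only, not the paper's method).

    imu_deltas:     per-step displacement integrated from IMU readings
                    (responsive, but accumulates drift).
    slam_positions: per-step global position from egocentric SLAM
                    (drift-free anchor in this toy model).
    alpha:          hypothetical blend weight; higher trusts the IMU more.
    """
    if len(imu_deltas) != len(slam_positions):
        raise ValueError("imu_deltas and slam_positions must have equal length")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    fused = []
    # Start from the first SLAM fix so the estimate lives in the global frame.
    estimate = slam_positions[0] if slam_positions else 0.0
    for delta, slam in zip(imu_deltas, slam_positions):
        predicted = estimate + delta  # propagate with the IMU increment
        # Pull the prediction toward the SLAM fix to bound accumulated drift.
        estimate = alpha * predicted + (1.0 - alpha) * slam
        fused.append(estimate)
    return fused


# A stationary wearer with a biased IMU: dead reckoning alone drifts without
# bound, while the fused estimate stays close to the SLAM anchor.
biased_deltas = [0.1] * 100
slam_fixes = [0.0] * 100
dead_reckoned_final = sum(biased_deltas)
fused_final = fuse_positions(biased_deltas, slam_fixes)[-1]
```

In this toy setting the IMU bias adds roughly 10 units of error to pure dead reckoning over 100 steps, while the fused error stays below 1 unit because each SLAM fix removes a fraction of the accumulated drift.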
