LessMimic: Long-Horizon Humanoid Interaction with Unified Distance Field Representations
Yutang Lin, Jieming Cui, Yixuan Li, Baoxiong Jia, Yixin Zhu, Siyuan Huang

TL;DR
LessMimic introduces a unified, reference-free approach for long-horizon humanoid interaction using Distance Fields, enabling geometric generalization, skill composition, and vision-only deployment without motion capture.
Contribution
It presents a novel Distance Field-based policy framework that generalizes across object scales, composes multiple skills, and transfers seamlessly to vision-only systems.
Findings
Achieves 80-100% success on PickUp and SitStand tasks across various object sizes.
Maintains viability up to 40 sequentially composed tasks.
Outperforms baselines, which degrade sharply under object scale variations.
Abstract
Humanoid robots that autonomously interact with physical environments over extended horizons represent a central goal of embodied intelligence. Existing approaches rely on reference motions or task-specific rewards, tightly coupling policies to particular object geometries and precluding multi-skill generalization within a single framework. A unified interaction representation enabling reference-free inference, geometric generalization, and long-horizon skill composition within one policy remains an open challenge. Here we show that Distance Field (DF) provides such a representation: LessMimic conditions a single whole-body policy on DF-derived geometric cues--surface distances, gradients, and velocity decompositions--removing the need for motion references, with interaction latents encoded via a Variational Auto-Encoder (VAE) and post-trained using Adversarial Interaction Priors (AIP)…
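To make the DF-derived cues named in the abstract more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a hypothetical spherical object, a finite-difference gradient, and that "velocity decomposition" means splitting a body-part velocity into components along and perpendicular to the distance-field gradient. All function names and values are invented for illustration.

```python
import numpy as np

def sphere_sdf(p, center, radius):
    """Signed distance from point p to a sphere surface (negative inside).
    The sphere is a hypothetical stand-in for an object's distance field."""
    return np.linalg.norm(p - center) - radius

def sdf_gradient(sdf, p, eps=1e-5):
    """Central finite-difference gradient of a scalar distance function at p."""
    grad = np.zeros(3)
    for i in range(3):
        step = np.zeros(3)
        step[i] = eps
        grad[i] = (sdf(p + step) - sdf(p - step)) / (2 * eps)
    return grad

def decompose_velocity(v, grad):
    """Split v into parts along and perpendicular to the DF gradient.
    Assumption: this is one plausible reading of 'velocity decomposition'."""
    n = grad / (np.linalg.norm(grad) + 1e-12)  # unit direction away from the surface
    v_normal = np.dot(v, n) * n                # approach/retreat component
    v_tangent = v - v_normal                   # sliding component
    return v_normal, v_tangent

# Hypothetical query: a hand position and velocity near a sphere of radius 0.5.
center = np.array([0.0, 0.0, 0.0])
hand = np.array([2.0, 0.0, 0.0])
hand_vel = np.array([-1.0, 0.5, 0.0])

sdf = lambda p: sphere_sdf(p, center, 0.5)
dist = sdf(hand)                       # surface distance
grad = sdf_gradient(sdf, hand)         # direction of increasing distance
v_n, v_t = decompose_velocity(hand_vel, grad)

print(dist, grad, v_n, v_t)
```

In this sketch the cues depend only on the geometry around the query point, which is one way to see why such a representation could carry over across object scales; how LessMimic actually computes and encodes these quantities is described in the paper itself.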
Taxonomy
Topics: Robot Manipulation and Learning · Social Robot Interaction and HRI · Human Pose and Action Recognition
