One-Policy-Fits-All: Geometry-Aware Action Latents for Cross-Embodiment Manipulation

Juncheng Mu; Sizhe Yang; Hojin Bae; Feiyu Jia; Qingwei Ben; Boyi Li; Huazhe Xu; Jiangmiao Pang

arXiv:2603.14522 · cs.RO · March 17, 2026

TL;DR

This paper introduces OPFA, a framework that learns a single, geometry-aware policy across multiple robot embodiments, significantly improving data efficiency and enabling effective skill transfer in cross-embodiment manipulation tasks.

Contribution

The paper proposes a novel Geometry-Aware Latent Representation and a unified decoder, allowing end-to-end training of a versatile policy across diverse robot embodiments without embodiment-specific tuning.

Findings

01 Cross-embodiment co-training improves success rates by over 50%.

02 Adding a few demonstrations from a new embodiment matches the performance of extensive training.

03 OPFA reduces data requirements and enhances policy generalization across 11 different end-effectors.

Abstract

Cross-embodiment manipulation is crucial for enhancing the scalability of robot manipulation and reducing the high cost of data collection. However, the significant differences between embodiments, such as variations in action spaces and structural disparities, pose challenges for joint training across multiple sources of data. To address this, we propose One-Policy-Fits-All (OPFA), a framework that enables learning a single, versatile policy across multiple embodiments. We first learn a Geometry-Aware Latent Representation (GaLR), which leverages 3D convolution networks and transformers to build a shared latent action space across different embodiments. Then we design a unified latent retargeting decoder that extracts embodiment-specific actions from the latent representations, without any embodiment-specific decoder tuning. OPFA enables end-to-end co-training of data from diverse…
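
The idea of one decoder serving every embodiment can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not describe the decoder's inputs or layers, so the geometry embedding, the padded action vector, the layer sizes, and every name below (`UnifiedDecoder`, `LATENT_DIM`, `GEOM_DIM`, `MAX_DOF`) are assumptions made purely for illustration. The only point it shows is that a single set of weights can map the same shared latent to action vectors of different sizes.

```python
import numpy as np

# All sizes below are hypothetical, chosen only for illustration.
LATENT_DIM = 16   # assumed size of the shared latent action
GEOM_DIM = 8      # assumed size of an embodiment geometry embedding
MAX_DOF = 12      # assumed upper bound on action dimensions


class UnifiedDecoder:
    """Hypothetical decoder: one set of weights shared by all embodiments."""

    def __init__(self, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0.0, 0.1, (LATENT_DIM + GEOM_DIM, 32))
        self.w2 = rng.normal(0.0, 0.1, (32, MAX_DOF))

    def decode(self, latent, geom, dof):
        # Condition the shared latent on the embodiment's geometry embedding.
        x = np.concatenate([latent, geom])
        hidden = np.tanh(x @ self.w1)
        padded = hidden @ self.w2   # action padded to MAX_DOF entries
        return padded[:dof]         # keep only this embodiment's dimensions


decoder = UnifiedDecoder()
latent = np.ones(LATENT_DIM)                    # same latent for both robots
gripper_geom = np.zeros(GEOM_DIM)               # placeholder embedding
hand_geom = np.linspace(0.0, 1.0, GEOM_DIM)     # placeholder embedding

gripper_action = decoder.decode(latent, gripper_geom, dof=1)
hand_action = decoder.decode(latent, hand_geom, dof=6)
```

Here the same latent yields a 1-dimensional action for the placeholder gripper and a 6-dimensional action for the placeholder hand, with no per-embodiment weights. How OPFA actually conditions and sizes its outputs is not stated in the text above.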

Taxonomy

Topics: Robot Manipulation and Learning · 3D Shape Modeling and Analysis · Human Pose and Action Recognition