Embedding Morphology into Transformers for Cross-Robot Policy Learning

Kei Suzuki; Jing Liu; Ye Wang; Chiori Hori; Matthew Brand; Diego Romeres; Toshiaki Koike-Akino

arXiv:2603.00182 · cs.RO · March 3, 2026

Open Access

TL;DR

This paper introduces an embodiment-aware transformer policy for cross-robot learning. It injects robot morphology through kinematic tokens, a topology-aware attention bias, and joint-attribute conditioning, with the aim of improving robustness and performance.

Contribution

The paper presents a transformer architecture that explicitly incorporates robot morphology to improve cross-embodiment policy learning.

Findings

01 Improved robustness across multiple robot embodiments.

02 Better performance than baseline models within a single embodiment.

03 Structured integration of morphology improves policy generalization.

Abstract

Cross-robot policy learning -- training a single policy to perform well across multiple embodiments -- remains a central challenge in robot learning. Transformer-based policies, such as vision-language-action (VLA) models, are typically embodiment-agnostic and must infer kinematic structure purely from observations, which can reduce robustness across embodiments and even limit performance within a single embodiment. We propose an embodiment-aware transformer policy that injects morphology via three mechanisms: (1) kinematic tokens that factorize actions across joints and compress time through per-joint temporal chunking; (2) a topology-aware attention bias that encodes kinematic topology as an inductive bias in self-attention, encouraging message passing along kinematic edges; and (3) joint-attribute conditioning that augments topology with per-joint descriptors to capture semantics…
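
As a rough illustration of mechanism (2), the sketch below builds an additive attention bias from a kinematic tree so that joints closer along kinematic edges attend to each other more strongly. It is not the paper's implementation: the abstract does not give the exact form of the bias, so the linear per-hop penalty, the `slope` parameter, the function names (`hop_distances`, `topology_bias`, `biased_attention_weights`), and the 4-joint example robot are all assumptions made for illustration.

```python
import math
from collections import deque


def hop_distances(parents):
    """All-pairs hop distance on a kinematic tree.

    parents[i] is the index of joint i's parent, or -1 for the root.
    (Hypothetical encoding of the morphology, chosen for this sketch.)
    """
    n = len(parents)
    # Build an undirected adjacency list from the parent pointers.
    adj = [[] for _ in range(n)]
    for child, parent in enumerate(parents):
        if parent >= 0:
            adj[child].append(parent)
            adj[parent].append(child)
    # Breadth-first search from every joint.
    dist = [[math.inf] * n for _ in range(n)]
    for src in range(n):
        dist[src][src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[src][v] == math.inf:
                    dist[src][v] = dist[src][u] + 1
                    queue.append(v)
    return dist


def topology_bias(parents, slope=1.0):
    """Additive bias: zero on the diagonal, -slope per kinematic hop elsewhere.

    The linear-in-distance form is an assumption, not taken from the paper.
    """
    return [[-slope * d for d in row] for row in hop_distances(parents)]


def biased_attention_weights(scores, bias):
    """Row-wise softmax of (scores + bias), with max-subtraction for stability."""
    weights = []
    for s_row, b_row in zip(scores, bias):
        logits = [s + b for s, b in zip(s_row, b_row)]
        m = max(logits)
        exps = [math.exp(x - m) for x in logits]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Hypothetical 4-joint robot: chain 0 -> 1 -> 2, plus a branch 1 -> 3.
parents = [-1, 0, 1, 1]
bias = topology_bias(parents, slope=1.0)

# With uniform content scores, the bias alone shapes the attention pattern.
uniform_scores = [[0.0] * 4 for _ in range(4)]
weights = biased_attention_weights(uniform_scores, bias)
```

With uniform scores, each joint token puts the most weight on itself, less on its direct kinematic neighbours, and least on joints several hops away. That is one simple way to encourage message passing along kinematic edges. In a real transformer, a bias of this kind would be added to the query-key logits of each head before the softmax.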

Taxonomy

Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · Robot Manipulation and Learning