Embodiment-Aware Generalist Specialist Distillation for Unified Humanoid Whole-Body Control
Quanquan Peng, Yunfeng Lin, Yufei Xue, Jiangmiao Pang, Weinan Zhang

TL;DR
This paper introduces EAGLE, a distillation framework that creates a unified humanoid control policy capable of managing diverse robots and complex behaviors without extensive tuning, advancing scalable humanoid control.
Contribution
EAGLE is the first iterative distillation method that produces a single policy controlling multiple heterogeneous humanoids with rich behaviors, without per-robot reward tuning.
Findings
Achieves high tracking accuracy across multiple robots
Demonstrates robustness in real-world experiments
Outperforms existing methods in both simulation and real-world settings
Abstract
Humanoid Whole-Body Controllers trained with reinforcement learning (RL) have recently achieved remarkable performance, yet many target a single robot embodiment. Variations in dynamics, degrees of freedom (DoFs), and kinematic topology still hinder a single policy from commanding diverse humanoids. Moreover, obtaining a generalist policy that not only transfers across embodiments but also supports richer behaviors (going beyond simple walking to squatting and leaning) remains especially challenging. In this work, we tackle these obstacles by introducing EAGLE, an iterative generalist-specialist distillation framework that produces a single unified policy that controls multiple heterogeneous humanoids without per-robot reward tuning. During each cycle, embodiment-specific specialists are forked from the current generalist, refined on their respective robots, and new skills are distilled back into…
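The fork, refine, distill cycle described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the policy here is a linear map over an observation plus a one-hot embodiment descriptor, specialist "refinement" is plain least-squares regression standing in for per-robot RL, and every name (`ROBOTS`, `features`, `refine_specialist`, `distill`, `iterate_generalist_specialist`) is hypothetical.

```python
import random

ROBOTS = ["robot_a", "robot_b"]  # hypothetical embodiment names


def features(obs, robot):
    # Scalar observation plus a one-hot embodiment descriptor (assumption of this sketch).
    return [obs] + [1.0 if robot == r else 0.0 for r in ROBOTS]


def act(theta, obs, robot):
    # Linear policy: action = theta . features(obs, robot).
    return sum(w * x for w, x in zip(theta, features(obs, robot)))


def sgd_fit(theta, samples, lr=0.1, epochs=200):
    # Least-squares regression by SGD; samples are (obs, robot, target_action).
    theta = list(theta)
    for _ in range(epochs):
        for obs, robot, target in samples:
            x = features(obs, robot)
            err = sum(w * xi for w, xi in zip(theta, x)) - target
            theta = [w - lr * err * xi for w, xi in zip(theta, x)]
    return theta


def refine_specialist(generalist, robot, expert, obs_batch):
    # Fork the generalist (sgd_fit copies it) and refine on one robot only.
    # Stand-in for RL: regress onto a robot-specific target behavior.
    samples = [(o, robot, expert[robot](o)) for o in obs_batch]
    return sgd_fit(generalist, samples)


def distill(generalist, specialists, obs_batch):
    # Supervised distillation: the generalist imitates each specialist
    # on that specialist's own embodiment.
    samples = [(o, r, act(specialists[r], o, r)) for r in ROBOTS for o in obs_batch]
    return sgd_fit(generalist, samples)


def iterate_generalist_specialist(expert, cycles=3, seed=0):
    rng = random.Random(seed)
    generalist = [0.0] * (1 + len(ROBOTS))
    for _ in range(cycles):
        obs_batch = [rng.uniform(-1.0, 1.0) for _ in range(16)]
        specialists = {
            r: refine_specialist(generalist, r, expert, obs_batch) for r in ROBOTS
        }
        generalist = distill(generalist, specialists, obs_batch)
    return generalist


if __name__ == "__main__":
    # Hypothetical per-robot target behaviors that differ only by an offset.
    expert = {"robot_a": lambda o: 0.5 * o + 1.0, "robot_b": lambda o: 0.5 * o - 1.0}
    theta = iterate_generalist_specialist(expert)
    print(act(theta, 0.2, "robot_a"), act(theta, 0.2, "robot_b"))
```

In this toy setting the single parameter vector ends up reproducing both robots' behaviors because the embodiment descriptor lets it separate them. A real system would replace the regression steps with RL training and a neural policy.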
Taxonomy
Topics: Robotic Locomotion and Control · Reinforcement Learning in Robotics · Robot Manipulation and Learning
