SCDP: Learning Humanoid Locomotion from Partial Observations via Mixed-Observation Distillation
Milo Carroll, Tianhu Peng, Lingfan Bao, Chengxu Zhou, Zhibin Li

TL;DR
This paper introduces SCDP, a diffusion-based policy that lets humanoid robots locomote using only onboard sensor data at deployment, with no explicit state estimation. It reaches 99-100% success on velocity control and 93% on motion tracking in simulation, and runs on a real humanoid without external sensors.
Contribution
SCDP is the first approach to learn humanoid locomotion from partial observations using mixed-observation diffusion training: privileged state is used only as a supervision target during training, so the deployed policy does not depend on it as an input.
Findings
Achieves 99-100% success in velocity control in simulation.
Attains 93% success in motion tracking in simulation.
Demonstrates robust real-world humanoid locomotion without external sensors.
Abstract
Distilling humanoid locomotion control from offline datasets into deployable policies remains a challenge, as existing methods rely on privileged full-body states that require complex and often unreliable state estimation. We present Sensor-Conditioned Diffusion Policies (SCDP), which enable humanoid locomotion using only onboard sensors, eliminating the need for explicit state estimation. SCDP decouples sensing from supervision through mixed-observation training: the diffusion model conditions on sensor histories while being supervised to predict privileged future state-action trajectories, forcing the model to infer the motion dynamics under partial observability. We further develop restricted denoising, context distribution alignment, and context-aware attention masking to encourage implicit state estimation within the model and to prevent train-deploy mismatch. We validate SCDP on…
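The mixed-observation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the window lengths (`history`, `horizon`), the array layout, and the use of a standard DDPM noise-prediction loss are all assumptions made for illustration. The only point it shows is the split the abstract describes: the model's conditioning input is built from onboard sensor history, while the denoising target is a future trajectory of privileged state and action.

```python
import numpy as np

def make_training_pair(sensors, states, actions, t, history=4, horizon=8):
    """Slice one (context, target) pair at time index t (hypothetical layout).

    context: the last `history` steps of onboard sensor readings only.
    target:  the next `horizon` steps of privileged state joined with action.
    """
    context = sensors[t - history + 1 : t + 1]           # (history, d_sensor)
    future = slice(t + 1, t + 1 + horizon)
    target = np.concatenate([states[future], actions[future]], axis=-1)
    return context, target                               # target: (horizon, d_state + d_action)

def add_noise(target, alpha_bar, rng):
    """Standard DDPM forward step: x_k = sqrt(a) * x_0 + sqrt(1 - a) * eps."""
    eps = rng.standard_normal(target.shape)
    noisy = np.sqrt(alpha_bar) * target + np.sqrt(1.0 - alpha_bar) * eps
    return noisy, eps

def denoising_loss(predict_eps, context, target, alpha_bar, rng):
    """MSE between predicted and true noise.

    `predict_eps` stands in for the diffusion model (assumed interface):
    it sees the noisy privileged trajectory and the sensor-only context,
    never the clean privileged state.
    """
    noisy, eps = add_noise(target, alpha_bar, rng)
    pred = predict_eps(noisy, context, alpha_bar)
    return float(np.mean((pred - eps) ** 2))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T = 32
    sensors = rng.standard_normal((T, 6))    # e.g. IMU + joint encoders (assumed sizes)
    states = rng.standard_normal((T, 10))    # privileged full-body state (assumed size)
    actions = rng.standard_normal((T, 3))
    ctx, tgt = make_training_pair(sensors, states, actions, t=10)
    zero_model = lambda noisy, context, a: np.zeros_like(noisy)
    print(ctx.shape, tgt.shape, denoising_loss(zero_model, ctx, tgt, 0.5, rng))
```

At deployment only `context` would be available, which is why the conditioning path in this sketch never touches `states`. The abstract's additional components (restricted denoising, context distribution alignment, context-aware attention masking) are not modeled here.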
Taxonomy
Topics: Robotic Locomotion and Control · Human Pose and Action Recognition · Human Motion and Animation
