Efficiently Learning Robust Torque-based Locomotion Through Reinforcement with Model-Based Supervision
Yashuai Yan, Tobias Egle, Christian Ott, and Dongheui Lee

TL;DR
This paper introduces a hybrid control framework that combines model-based control with residual reinforcement learning to achieve robust, adaptive bipedal walking under modeling uncertainty and in sim-to-real transfer.
Contribution
It presents a novel supervised residual RL approach guided by a model-based oracle policy for improved robustness in bipedal locomotion.
Findings
Enhanced robustness across randomized conditions
Effective sim-to-real transfer demonstrated
Supervised residual learning accelerates policy training
Abstract
We propose a control framework that integrates model-based bipedal locomotion with residual reinforcement learning (RL) to achieve robust and adaptive walking in the presence of real-world uncertainties. Our approach leverages a model-based controller, comprising a Divergent Component of Motion (DCM) trajectory planner and a whole-body controller, as a reliable base policy. To address the uncertainties of inaccurate dynamics modeling and sensor noise, we introduce a residual policy trained through RL with domain randomization. Crucially, we employ a model-based oracle policy, which has privileged access to ground-truth dynamics during training, to supervise the residual policy via a novel supervised loss. This supervision enables the policy to efficiently learn corrective behaviors that compensate for unmodeled effects without extensive reward shaping. Our method demonstrates improved…
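The relationship between the base policy, the residual policy, and the oracle supervision can be illustrated with a minimal sketch. The abstract does not give the form of the supervised loss, so everything below is an assumption made for illustration: the function names, the `residual_scale` factor, the mean-squared-error form of the supervision term, and the `sup_weight` used to mix it with an RL objective are hypothetical and are not taken from the paper.

```python
import numpy as np


def compose_torque(base_torque, residual, residual_scale=0.2):
    """Add a scaled learned correction to the model-based torque command.

    residual_scale is a hypothetical bound on how far the residual
    may move the command away from the base controller's output.
    """
    return np.asarray(base_torque) + residual_scale * np.asarray(residual)


def supervised_residual_loss(residual, base_torque, oracle_torque,
                             residual_scale=0.2):
    """Mean squared gap between the composed command and the oracle command.

    Assumed form only: the oracle is treated as a target that the
    base-plus-residual command should match during training.
    """
    composed = compose_torque(base_torque, residual, residual_scale)
    return float(np.mean((composed - np.asarray(oracle_torque)) ** 2))


def total_loss(rl_loss, sup_loss, sup_weight=1.0):
    """Combine an RL objective with the supervision term (assumed weighting)."""
    return rl_loss + sup_weight * sup_loss


# Toy two-joint example with made-up numbers.
base = np.array([1.0, -2.0])     # torque from the model-based controller
oracle = np.array([1.2, -2.4])   # torque from the privileged oracle

no_correction = supervised_residual_loss(np.zeros(2), base, oracle)
good_correction = supervised_residual_loss(np.array([1.0, -2.0]), base, oracle)
print(no_correction, good_correction)
```

In this toy setup, a residual that closes the gap to the oracle drives the supervision term toward zero, while a zero residual leaves the full base-versus-oracle discrepancy in the loss. At deployment only the base controller and the residual would be evaluated, since the oracle relies on privileged ground-truth dynamics available in training.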
Taxonomy
Topics: Robotic Locomotion and Control · Prosthetics and Rehabilitation Robotics · Reinforcement Learning in Robotics
