Safe Learning of Locomotion Skills from MPC
Xun Pua, Majid Khadiv

TL;DR
This paper presents a safe learning framework for locomotion skills using MPC as an expert, significantly reducing failures during training and enhancing robustness to disturbances compared to standard methods.
Contribution
The work introduces a novel safe learning approach combining MPC with SafeDAGGER to improve safety and robustness in locomotion skill acquisition.
Findings
Fewer failures during training compared to baseline methods
Resulting policies are more robust to external disturbances
Outperforms behavior cloning and vanilla DAGGER in safety and robustness
Abstract
Safe learning of locomotion skills is still an open problem. Indeed, the intrinsically unstable nature of the open-loop dynamics of locomotion systems renders naive learning from scratch prone to catastrophic failures in the real world. In this work, we investigate the use of iterative algorithms to safely learn locomotion skills from model predictive control (MPC). In our framework, we use MPC as an expert and take inspiration from the safe data aggregation (SafeDAGGER) framework to minimize the number of failures during training of the policy. Through a comparison with other standard approaches such as behavior cloning and vanilla DAGGER, we show that our approach not only incurs substantially fewer failures during training, but also yields a policy that is more robust to external disturbances.
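As a rough illustration of the iterative scheme the abstract describes, the Python sketch below runs a SafeDAGGER-style loop on a toy problem. Everything in it is an assumption made for illustration and is not taken from the paper: the scalar unstable system, the linear feedback law standing in for the MPC expert, the linear learner, the failure limit, and the threshold `tau`. The gating rule also queries the expert directly at every step, which is a simplification; SafeDAGGER as usually described learns a separate safety classifier for this decision.

```python
import numpy as np

# Toy setup (all values are illustrative assumptions, not from the paper).
A, B = 1.2, 1.0        # scalar dynamics x' = A*x + B*u, unstable in open loop
K_EXPERT = 0.9         # gain of a linear feedback law standing in for MPC
FAIL_LIMIT = 5.0       # |x| beyond this counts as a failure (a "fall")


def expert(x):
    """Stand-in expert action; the paper uses an MPC controller here."""
    return -K_EXPERT * x


def fit(states, actions):
    """Least-squares fit of the gain w for a linear policy u = w * x."""
    s, a = np.asarray(states), np.asarray(actions)
    return float(s @ a / (s @ s))


def rollout(w, rng, tau, horizon=50):
    """Run one episode; the expert takes over whenever the learner's
    action deviates from the expert's by more than tau."""
    x = rng.uniform(-1.0, 1.0)
    visited, failed = [], False
    for _ in range(horizon):
        u_exp, u_pol = expert(x), w * x
        unsafe = abs(u_pol - u_exp) > tau
        if unsafe:
            visited.append(x)          # only expert-controlled states are kept
        u = u_exp if unsafe else u_pol
        x = A * x + B * u + rng.normal(0.0, 0.01)
        if abs(x) > FAIL_LIMIT:
            failed = True
            break
    return visited, failed


def safe_dagger(iterations=3, tau=0.1, seed=0):
    rng = np.random.default_rng(seed)
    w = 0.0                            # untrained learner: applies no control
    states, actions = [], []
    failures, interventions = 0, []
    for _ in range(iterations):
        visited, failed = rollout(w, rng, tau)
        failures += int(failed)
        interventions.append(len(visited))
        states += visited              # aggregate data across iterations
        actions += [expert(x) for x in visited]
        if states:
            w = fit(states, actions)   # retrain on the aggregated dataset
    return w, failures, interventions


w, failures, interventions = safe_dagger()
print(f"learned gain: {w:.3f}, failures: {failures}, interventions: {interventions}")
```

In this toy setting the expert intervenes only while the learner disagrees with it, so the number of interventions drops to zero once the learner has been fitted. Vanilla DAGGER would instead let the learner act in every state it visits, which is where training-time failures can arise on an unstable system.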
Taxonomy
Topics: Robot Manipulation and Learning · Robotics and Automated Systems · Robotic Mechanisms and Dynamics
