RL-Augmented MPC for Non-Gaited Legged and Hybrid Locomotion
Andrea Patrizi, Carlo Rizzardo, Arturo Laurenzi, Francesco Ruscelli, Luca Rossini, Nikos G. Tsagarakis

TL;DR
This paper introduces a hierarchical RL-MPC architecture for legged and hybrid robots that learns gaits in simulation and transfers them to the real world without domain randomization.
Contribution
It presents a contact-explicit hierarchical framework coupling RL and MPC, allowing zero-shot sim-to-real transfer and minimal reward tuning for diverse robotic platforms.
Findings
Emergence of acyclic gaits and timing adaptations in simulation
Zero-shot sim-to-sim transfer across platforms
Successful zero-shot sim-to-real transfer on a humanoid robot
Abstract
We propose a contact-explicit hierarchical architecture coupling Reinforcement Learning (RL) and Model Predictive Control (MPC), where a high-level RL agent provides gait and navigation commands to a low-level locomotion MPC. This offloads the combinatorial burden of contact timing from the MPC by learning acyclic gaits through trial and error in simulation. We show that only a minimal set of rewards and limited tuning are required to obtain effective policies. We validate the architecture in simulation across robotic platforms spanning 50 kg to 120 kg and different MPC implementations, observing the emergence of acyclic gaits and timing adaptations in flat-terrain legged and hybrid locomotion, and further demonstrating extensibility to non-flat terrains. Across all platforms, we achieve zero-shot sim-to-sim transfer without domain randomization, and we further demonstrate zero-shot…
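The division of labor described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the excerpt does not specify the interface between the RL agent and the MPC, so `HighLevelAction`, `placeholder_policy`, and `StubMPC` are hypothetical names, the policy is a random stand-in rather than a trained agent, and the "MPC" only maintains a contact schedule over its horizon instead of solving an optimal control problem. The point is only the data flow: the high-level agent chooses per-leg contact flags and a navigation command at each step, with no fixed gait clock, and the low-level controller consumes them.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class HighLevelAction:
    """Hypothetical command sent from the RL agent to the MPC."""
    contact_flags: List[bool]                   # per leg: True = stance, False = flight
    base_twist_cmd: Tuple[float, float, float]  # (vx, vy, yaw rate) navigation command


def placeholder_policy(n_legs: int, rng: random.Random) -> HighLevelAction:
    """Random stand-in for a trained policy (assumption, not the paper's agent)."""
    flags = [rng.random() < 0.7 for _ in range(n_legs)]
    return HighLevelAction(flags, (0.5, 0.0, 0.0))


class StubMPC:
    """Stand-in for a locomotion MPC: it only tracks the contact schedule."""

    def __init__(self, n_legs: int, horizon: int) -> None:
        # Start with every leg in stance at every node of the horizon.
        self.schedule = [[True] * n_legs for _ in range(horizon)]

    def step(self, action: HighLevelAction) -> List[bool]:
        # Receding horizon: drop the oldest node, append the newest request
        # at the tail, and return the contact pattern now at the head.
        self.schedule.pop(0)
        self.schedule.append(list(action.contact_flags))
        return self.schedule[0]


def run(n_steps: int, n_legs: int = 4, horizon: int = 5, seed: int = 0) -> List[List[bool]]:
    """Run the two-level loop and return the contact pattern at each step."""
    rng = random.Random(seed)
    mpc = StubMPC(n_legs, horizon)
    executed = []
    for _ in range(n_steps):
        action = placeholder_policy(n_legs, rng)  # high level: gait + navigation
        executed.append(mpc.step(action))         # low level: consume the command
    return executed
```

In this sketch a contact request enters at the tail of the horizon and reaches the head several steps later, so the low-level controller always sees the upcoming contact sequence in advance. Because the flags are chosen independently at every step rather than read from a periodic template, the resulting contact sequence is not constrained to be cyclic.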
Taxonomy
Topics: Robotic Locomotion and Control · Prosthetics and Rehabilitation Robotics · Reinforcement Learning in Robotics
