Improving generalization of robot locomotion policies via Sharpness-Aware Reinforcement Learning
Severin Bochem, Eduardo Gonzalez-Sanchez, Yves Bicker, Gabriele Fadini

TL;DR
This paper presents a novel sharpness-aware reinforcement learning method that improves the robustness and generalization of robot locomotion policies in contact-rich environments, while maintaining sample efficiency.
Contribution
It integrates sharpness-aware optimization into gradient-based reinforcement learning to find flatter minima, improving policy robustness and generalization for simulation-to-real transfer in robotics.
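To make the idea of seeking flatter minima concrete, here is a minimal sketch of the generic sharpness-aware minimization (SAM) two-step update on a toy quadratic loss. It is an illustration under stated assumptions, not the paper's algorithm: the names `sam_step`, `toy_loss`, and `toy_loss_grad`, the step size `lr`, and the perturbation radius `rho` are all hypothetical. In a differentiable-simulator setting, the gradient function would presumably be the first-order gradient of the negative return, which is not modeled here.

```python
import numpy as np

# Toy quadratic loss 0.5 * theta^T A theta with one sharp and one flat direction.
# This stands in for a negative-return objective; it is NOT from the paper.
A = np.diag([10.0, 0.1])

def toy_loss(theta):
    return 0.5 * theta @ A @ theta

def toy_loss_grad(theta):
    return A @ theta

def sam_step(theta, grad_fn, lr=0.05, rho=0.05):
    """One generic SAM update (hypothetical helper, assumed hyperparameters)."""
    g = grad_fn(theta)
    # Step 1: move to the approximate worst-case point within an L2 ball of radius rho.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Step 2: evaluate the gradient at the perturbed parameters.
    g_sharp = grad_fn(theta + eps)
    # Step 3: apply that gradient to the original (unperturbed) parameters.
    return theta - lr * g_sharp

theta0 = np.array([1.0, 1.0])
theta = theta0.copy()
for _ in range(200):
    theta = sam_step(theta, toy_loss_grad)

print("initial loss:", toy_loss(theta0))
print("final loss:  ", toy_loss(theta))
```

Setting `rho=0` reduces `sam_step` to an ordinary gradient step, which makes the extra cost of the method explicit: one additional gradient evaluation per update.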
Findings
Enhanced policy robustness to environmental variations
Improved action noise tolerance over standard methods
Achieved generalization comparable to zeroth-order approaches
Abstract
Reinforcement learning often requires extensive training data. Simulation-to-real transfer offers a promising approach to address this challenge in robotics. While differentiable simulators offer improved sample efficiency through exact gradients, they can be unstable in contact-rich environments and may lead to poor generalization. This paper introduces a novel approach integrating sharpness-aware optimization into gradient-based reinforcement learning algorithms. Our simulation results demonstrate that our method, tested on contact-rich environments, significantly enhances policy robustness to environmental variations and action perturbations while maintaining the sample efficiency of first-order methods. Specifically, our approach improves action noise tolerance compared to standard first-order methods and achieves generalization comparable to zeroth-order methods. This improvement…
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Robotic Locomotion and Control
