Closing the Sim2Real Performance Gap in RL
Akhil S Anand, Shambhuraj Sawant, Jasper Hoffmann, Dirk Reinhardt, Sebastien Gros

TL;DR
This paper introduces a bi-level RL framework that directly adapts simulation parameters based on real-world performance to reduce the Sim2Real gap in reinforcement learning policies.
Contribution
It proposes a novel bi-level RL approach that optimizes simulation parameters with respect to real-world performance, addressing limitations of existing proxy metrics.
Findings
Bi-level RL effectively reduces the Sim2Real performance gap.
Mathematical tools for bi-level RL algorithms are derived and validated.
Adapting the simulator to real-world performance yields better deployed policies than tuning it against traditional simulation accuracy metrics.
Abstract
Sim2Real aims at training policies in high-fidelity simulation environments and effectively transferring them to the real world. Despite the development of accurate simulators and Sim2Real RL approaches, policies trained purely in simulation often suffer significant performance drops when deployed in real environments. This drop is referred to as the Sim2Real performance gap. Current Sim2Real RL methods optimize simulator accuracy and variability as proxies for real-world performance. However, as established theoretically and empirically in the literature, these metrics do not necessarily correlate with the real-world performance of the policy. We propose a novel framework to address this issue by directly adapting the simulator parameters based on real-world performance. We frame this problem as a bi-level RL framework: the inner-level RL trains a policy purely in simulation,…
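The bi-level structure described in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the scalar linear system, the hypothetical function names, the use of a Riccati iteration as a stand-in for inner-level RL training, and the finite-difference outer update (the paper derives its own mathematical tools for the outer level, which the truncated abstract does not spell out). The simulator deliberately carries a structural error (`a_sim` differs from `a_real`), so the most "accurate" value of the tunable parameter is not necessarily the one that gives the best real-world cost.

```python
# Toy bi-level loop: tune a simulator parameter against real-world cost.
# All names and numbers are hypothetical; this is not the paper's algorithm.

def train_policy_in_sim(theta, a_sim=0.9, q=1.0, r=0.1, iters=200):
    """Inner level (stand-in for RL): optimal feedback gain for the
    simulator x' = a_sim * x + theta * u with stage cost q*x^2 + r*u^2,
    obtained by iterating the scalar Riccati equation."""
    p = q
    for _ in range(iters):
        k = a_sim * theta * p / (r + theta * theta * p)
        p = q + a_sim * a_sim * p - a_sim * theta * p * k
    return a_sim * theta * p / (r + theta * theta * p)

def real_cost(k, a_real=1.1, b_real=0.5, q=1.0, r=0.1, horizon=100, x0=1.0):
    """Roll out the policy u = -k * x on the 'real' system and sum the cost."""
    x, total = x0, 0.0
    for _ in range(horizon):
        u = -k * x
        total += q * x * x + r * u * u
        x = a_real * x + b_real * u
    return total

def adapt_simulator(theta=0.5, lr=0.1, steps=100, eps=1e-4):
    """Outer level: gradient descent on theta, where the objective is the
    real-world cost of the policy trained in the simulator with that theta.
    The gradient is approximated by central finite differences."""
    def objective(th):
        return real_cost(train_policy_in_sim(th))

    history = [objective(theta)]
    for _ in range(steps):
        grad = (objective(theta + eps) - objective(theta - eps)) / (2 * eps)
        theta -= lr * grad
        history.append(objective(theta))
    return theta, history

theta_final, history = adapt_simulator()
print("initial real cost:", history[0])
print("final real cost:  ", history[-1])
print("adapted theta:    ", theta_final)
```

In this toy setting the outer loop starts from `theta = 0.5`, which matches the real input gain `b_real`, and moves away from it because the mismatched `a_sim` makes a different value of `theta` produce a policy that performs better on the real system. Each outer step here needs two extra inner-level training runs and real rollouts, which is one reason a practical method would want an analytical gradient rather than finite differences.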
Taxonomy
Topics: Simulation Techniques and Applications · Reinforcement Learning in Robotics · Model Reduction and Neural Networks
