SafeAPT: Safe Simulation-to-Real Robot Learning using Diverse Policies Learned in Simulation
Rituraj Kaushik, Karol Arndt, Ville Kyrki

TL;DR
SafeAPT is a novel algorithm that safely transfers diverse policies learned in simulation to real robots by using probabilistic models and Bayesian optimization, enabling quick adaptation with minimal safety violations.
Contribution
SafeAPT introduces a safety-aware transfer learning method that leverages simulated policy repertoires and probabilistic models for safe real-world robot adaptation.
Findings
SafeAPT finds high-performance policies within minutes in real-world tests.
It minimizes safety violations during adaptation.
SafeAPT outperforms several baseline methods in safety and efficiency.
Abstract
The framework of simulation-to-real learning, i.e., learning policies in simulation and transferring those policies to the real world, is one of the most promising approaches towards data-efficient learning in robotics. However, due to the inevitable reality gap between the simulation and the real world, a policy learned in simulation may not always generate safe behaviour on the real robot. As a result, during adaptation of the policy in the real world, the robot may damage itself or cause harm to its surroundings. In this work, we introduce a novel learning algorithm called SafeAPT that leverages a diverse repertoire of policies evolved in simulation and transfers the most promising safe policy to the real robot through episodic interaction. To achieve this, SafeAPT iteratively learns a probabilistic reward model as well as a safety model using real-world observations combined…
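The episodic loop described in the abstract can be illustrated with a minimal sketch. The abstract is truncated here, so everything below is an assumption rather than the authors' implementation: Gaussian-process models with an RBF kernel, simulation predictions used as the prior mean, an upper-confidence-bound acquisition, a lower-confidence-bound safety test, and all names (`gp_posterior`, `select_policy`, `adapt`, `real_episode`) are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.3, amplitude=1.0):
    """Squared-exponential kernel between two sets of row vectors."""
    sq_dist = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return amplitude ** 2 * np.exp(-0.5 * sq_dist / length_scale ** 2)

def gp_posterior(x_all, prior_mean, tried_idx, observed, noise=1e-3, amplitude=1.0):
    """Posterior mean and std over every policy in the repertoire.

    Assumption: the simulation prediction acts as the GP prior mean, so
    real-world data only has to explain the sim-to-real residual.
    """
    if len(tried_idx) == 0:
        return prior_mean.copy(), amplitude * np.ones(len(x_all))
    x_tried = x_all[tried_idx]
    k_tt = rbf_kernel(x_tried, x_tried, amplitude=amplitude)
    k_tt = k_tt + noise * np.eye(len(tried_idx))
    k_at = rbf_kernel(x_all, x_tried, amplitude=amplitude)
    residual = np.asarray(observed) - prior_mean[tried_idx]
    mean = prior_mean + k_at @ np.linalg.solve(k_tt, residual)
    var = amplitude ** 2 - np.sum(k_at * np.linalg.solve(k_tt, k_at.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def select_policy(reward_mean, reward_std, safety_mean, safety_std,
                  safety_threshold, beta=1.0):
    """Pick the best-looking policy among those predicted to be safe."""
    safety_lcb = safety_mean - beta * safety_std
    safe = safety_lcb >= safety_threshold
    if not np.any(safe):
        # Nothing is confidently safe: fall back to the safest candidate.
        return int(np.argmax(safety_lcb))
    ucb = np.where(safe, reward_mean + beta * reward_std, -np.inf)
    return int(np.argmax(ucb))

def adapt(descriptors, sim_reward, sim_safety, real_episode,
          safety_threshold, episodes=10, amplitude=0.2):
    """Run the episodic loop; returns (best tried index, all tried indices)."""
    tried, rewards, safeties = [], [], []
    for _ in range(episodes):
        r_mean, r_std = gp_posterior(descriptors, sim_reward, tried, rewards,
                                     amplitude=amplitude)
        s_mean, s_std = gp_posterior(descriptors, sim_safety, tried, safeties,
                                     amplitude=amplitude)
        idx = select_policy(r_mean, r_std, s_mean, s_std, safety_threshold)
        reward, safety = real_episode(idx)  # one trial on the (toy) real robot
        tried.append(idx)
        rewards.append(reward)
        safeties.append(safety)
    return tried[int(np.argmax(rewards))], tried

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    descriptors = rng.uniform(0.0, 1.0, size=(200, 2))   # toy repertoire
    sim_reward = descriptors[:, 0]                        # toy simulated reward
    sim_safety = 1.0 - descriptors[:, 1]                  # toy simulated safety
    # Toy "reality gap": reward shrinks and safety margin drops in the real world.
    real = lambda i: (0.8 * sim_reward[i], sim_safety[i] - 0.1)
    best, tried = adapt(descriptors, sim_reward, sim_safety, real, 0.5)
    print("best policy index:", best, "after", len(tried), "episodes")
```

In this sketch the safety test uses a pessimistic bound while the reward uses an optimistic one, which is one common way to trade exploration against caution; the paper's actual acquisition rule may differ.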
Taxonomy
Topics: Reinforcement Learning in Robotics · Machine Learning in Healthcare · Adversarial Robustness in Machine Learning
