Bypassing the Simulation-to-reality Gap: Online Reinforcement Learning using a Supervisor

Benjamin David Evans, Johannes Betz, Hongrui Zheng, Herman A. Engelbrecht, Rahul Mangharam, and Hendrik W. Jordaan

arXiv:2209.11082·cs.RO·July 14, 2023

TL;DR

This paper introduces an online reinforcement learning approach for autonomous vehicle control that uses a safety supervisor to bypass the sim-to-real gap, enabling safe, efficient training directly on physical robots.

Contribution

The paper presents a novel online DRL training method with a safety supervisor that ensures safety and improves learning efficiency without prior simulation training.

Findings

1. Enhanced sample efficiency in training
2. Agents never crash during training
3. Better driving performance compared to simulation-trained agents

Abstract

Deep reinforcement learning (DRL) is a promising method to learn control policies for robots only from demonstration and experience. To cover the whole dynamic behaviour of the robot, DRL training is an active exploration process typically performed in simulation environments. Although this simulation training is cheap and fast, applying DRL algorithms to real-world settings is difficult. If agents are trained until they perform safely in simulation, transferring them to physical systems is difficult due to the sim-to-real gap caused by the difference between the simulation dynamics and the physical robot. In this paper, we present a method of online training a DRL agent to drive autonomously on a physical vehicle by using a model-based safety supervisor. Our solution uses a supervisory system to check if the action selected by the agent is safe or unsafe and ensure that a safe action…
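
To make the supervisor idea in the abstract concrete, the following is a minimal sketch of an action filter: a model predicts where the agent's chosen action leads, and an unsafe action is replaced by a safe one. It is not the authors' implementation. The 1-D lateral-offset model, the class name `Supervisor`, and the parameters `half_width`, `dt`, and `horizon` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Supervisor:
    """Hypothetical model-based supervisor for a toy 1-D lateral-offset problem."""

    half_width: float = 1.0  # assumed track half-width in metres
    dt: float = 0.1          # assumed control time step in seconds
    horizon: int = 5         # assumed number of look-ahead steps

    def predict(self, offset: float, action: float) -> float:
        # Toy model: the action is a lateral velocity held for the whole horizon.
        return offset + action * self.dt * self.horizon

    def is_safe(self, offset: float, action: float) -> bool:
        # An action is safe if the predicted offset stays inside the track.
        return abs(self.predict(offset, action)) < self.half_width

    def safe_action(self, offset: float) -> float:
        # Assumed fallback: steer back toward the track centre.
        return -offset

    def filter(self, offset: float, action: float) -> Tuple[float, bool]:
        # Return the action to execute and whether the supervisor intervened.
        if self.is_safe(offset, action):
            return action, False
        return self.safe_action(offset), True


supervisor = Supervisor()
kept = supervisor.filter(0.0, 1.0)      # predicted offset 0.5, inside the track
replaced = supervisor.filter(0.8, 1.0)  # predicted offset 1.3, outside the track
print(kept, replaced)
```

In an online training loop, the intervention flag returned by `filter` could be used to penalise the agent for selecting an unsafe action; whether and how the paper does this is not stated in the truncated abstract above.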

Taxonomy

Topics: Autonomous Vehicle Technology and Safety · Reinforcement Learning in Robotics