Learning a Low-dimensional Representation of a Safe Region for Safe Reinforcement Learning on Dynamical Systems

Zhehua Zhou; Ozgur S. Oguz; Marion Leibold; Martin Buss

arXiv:2010.09555·cs.RO·September 9, 2021

TL;DR

This paper introduces a data-driven method to efficiently learn and adapt a low-dimensional safe region representation for reinforcement learning in complex dynamical systems, demonstrated on a quadcopter example.

Contribution

It proposes a general online adaptation approach to accurately identify safe regions, enhancing the applicability of safe reinforcement learning frameworks.

Findings

01 More reliable safe region representations were obtained.

02 The approach extended safe RL to complex systems.

03 Demonstrated effectiveness on a quadcopter example.

Abstract

To safely apply reinforcement learning algorithms to high-dimensional nonlinear dynamical systems, a simplified system model is used to formulate a safe reinforcement learning framework. Based on the simplified system model, a low-dimensional representation of the safe region is identified and used to provide safety estimates for learning algorithms. However, finding a satisfactory simplified system model for complex dynamical systems usually requires considerable effort. To overcome this limitation, we propose in this work a general data-driven approach that efficiently learns a low-dimensional representation of the safe region. Through an online adaptation method, the low-dimensional representation is updated using feedback data so that more accurate safety estimates are obtained. The performance of the proposed approach for identifying the…
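
To make the idea of a low-dimensional safety estimate with online adaptation more concrete, here is a minimal sketch. It is not the authors' method: the abstract does not specify how the representation is learned or updated, so everything here is an illustrative assumption, including the linear projection `project`, the grid-based `GridSafetyEstimate` class, the moving-average update rule, and all parameter names and values.

```python
import numpy as np


class GridSafetyEstimate:
    """Hypothetical safety estimate over a 2-D low-dimensional state.

    Assumption for illustration only: safety is stored as a value in
    [0, 1] per grid cell and nudged toward observed feedback.
    """

    def __init__(self, low, high, bins=10, prior=0.5, rate=0.2):
        self.low = np.asarray(low, dtype=float)
        self.high = np.asarray(high, dtype=float)
        self.bins = bins
        self.rate = rate  # step size of the online update
        self.values = np.full((bins, bins), prior)

    def _cell(self, z):
        # Map a low-dimensional state to its grid cell, clipping to the grid.
        frac = (np.asarray(z, dtype=float) - self.low) / (self.high - self.low)
        idx = np.clip((frac * self.bins).astype(int), 0, self.bins - 1)
        return tuple(idx)

    def estimate(self, z):
        # Current safety estimate for the cell containing z.
        return float(self.values[self._cell(z)])

    def update(self, z, was_safe):
        # Move the cell's estimate toward 1 (safe) or 0 (unsafe) feedback.
        cell = self._cell(z)
        target = 1.0 if was_safe else 0.0
        self.values[cell] += self.rate * (target - self.values[cell])


def project(x, P):
    # Assumed linear map from the full state to the low-dimensional state.
    return P @ np.asarray(x, dtype=float)


# Usage: keep the first two coordinates of a 4-D state (hypothetical choice).
P = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0]])
est = GridSafetyEstimate(low=[-1.0, -1.0], high=[1.0, 1.0])
z = project([0.5, 0.5, 3.0, -2.0], P)
est.update(z, was_safe=True)  # feedback from one observed rollout
```

In this sketch each piece of feedback only changes the estimate of the cell that was visited, so estimates elsewhere stay at the prior until data arrives there. A learned representation, as the abstract describes, would presumably generalize across states rather than rely on a fixed grid.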
