Safety aware model-based reinforcement learning for optimal control of a class of output-feedback nonlinear systems

S M Nahid Mahmud; Moad Abudia; Scott A Nivison; Zachary I. Bell; Rushikesh Kamalapurkar

arXiv:2110.00271·eess.SY·October 6, 2021

Open Access

TL;DR

This paper introduces a safe model-based reinforcement learning method for nonlinear systems with output feedback, employing a novel dynamic state estimator to enable safe learning and control when full state information is unavailable.

Contribution

It develops a new output-feedback safe reinforcement learning approach using a dynamic state estimator, extending safety guarantees to partially observable nonlinear systems.

Findings

01 Successfully integrates a dynamic state estimator with safe RL.

02 Enables safe control in systems with partial observability.

03 Extends barrier-based safety methods to output-feedback scenarios.

Abstract

The ability to learn and execute optimal control policies safely is critical to the realization of complex autonomy, especially where task restarts are not available and/or the systems are safety-critical. Safety requirements are often expressed in terms of state and/or control constraints. Methods such as barrier transformation and control barrier functions have been used successfully, in conjunction with model-based reinforcement learning, to learn the optimal control policy safely in systems under state constraints. However, existing barrier-based safe learning methods rely on full state feedback. In this paper, an output-feedback safe model-based reinforcement learning technique is developed that utilizes a novel dynamic state estimator to implement simultaneous learning and control for a class of safety-critical systems with partially observable state.
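
The abstract names barrier transformation without giving its form. As a minimal sketch, assuming the logarithmic barrier function that is common in the barrier-transformation literature (this page does not confirm it is the authors' exact choice), a scalar state x constrained to an interval (a, A) with a < 0 < A can be mapped to an unconstrained coordinate s and back. The names `barrier`, `barrier_inverse`, `a`, and `A` are illustrative assumptions, not taken from the paper.

```python
import math


def barrier(x: float, a: float, A: float) -> float:
    """Map a constrained state x in (a, A) to an unconstrained coordinate s.

    Assumed form: s = ln((A / a) * (a - x) / (A - x)), with a < 0 < A.
    s goes to -inf as x approaches a and to +inf as x approaches A.
    """
    if not (a < 0.0 < A):
        raise ValueError("bounds must satisfy a < 0 < A")
    if not (a < x < A):
        raise ValueError("x must lie strictly inside (a, A)")
    return math.log((A / a) * (a - x) / (A - x))


def barrier_inverse(s: float, a: float, A: float) -> float:
    """Map an unconstrained coordinate s back to the constrained state x."""
    if not (a < 0.0 < A):
        raise ValueError("bounds must satisfy a < 0 < A")
    e = math.exp(s)
    # The denominator A - a * e is strictly positive because a < 0 < A.
    return a * A * (1.0 - e) / (A - a * e)


if __name__ == "__main__":
    a, A = -1.0, 2.0  # hypothetical state bounds
    for x in (-0.9, 0.0, 1.5):
        s = barrier(x, a, A)
        print(x, s, barrier_inverse(s, a, A))
```

Under this assumption, any finite s maps back to a state strictly inside (a, A), which is why a controller designed in the transformed coordinate cannot push the original state across the constraint boundary.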

Taxonomy

Topics: Adaptive Dynamic Programming Control · Reinforcement Learning in Robotics · Adaptive Control of Nonlinear Systems