Safety aware model-based reinforcement learning for optimal control of a class of output-feedback nonlinear systems
S M Nahid Mahmud, Moad Abudia, Scott A Nivison, Zachary I. Bell, Rushikesh Kamalapurkar

TL;DR
This paper introduces a safe model-based reinforcement learning method for nonlinear systems with output feedback, employing a novel dynamic state estimator to enable safe learning and control when full state information is unavailable.
Contribution
It develops a new output-feedback safe reinforcement learning approach using a dynamic state estimator, extending safety guarantees to partially observable nonlinear systems.
Findings
Successfully integrates a dynamic state estimator with safe RL.
Enables safe control in systems with partial observability.
Extends barrier-based safety methods to output-feedback scenarios.
Abstract
The ability to learn and execute optimal control policies safely is critical to realization of complex autonomy, especially where task restarts are not available and/or the systems are safety-critical. Safety requirements are often expressed in terms of state and/or control constraints. Methods such as barrier transformation and control barrier functions have been successfully used, in conjunction with model-based reinforcement learning, for safe learning in systems under state constraints, to learn the optimal control policy. However, existing barrier-based safe learning methods rely on full state feedback. In this paper, an output-feedback safe model-based reinforcement learning technique is developed that utilizes a novel dynamic state estimator to implement simultaneous learning and control for a class of safety-critical systems with partially observable state.
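The abstract names barrier transformation without stating it. As a rough illustration only, the sketch below uses one logarithmic barrier that appears in the barrier-transformation literature. The specific formula, the function names `barrier` and `barrier_inverse`, and the bounds `a` and `A` are assumptions made for this example and are not taken from the text above. The idea is to map a state constrained to an open interval (a, A), with a < 0 < A, to an unconstrained coordinate, so that any finite value of the transformed state corresponds to an original state strictly inside the constraint set.

```python
import math


def barrier(y, a, A):
    """Map y in the open interval (a, A), a < 0 < A, to an unconstrained real.

    Assumed form: b(y) = log( A (a - y) / ( a (A - y) ) ), so that b(0) = 0,
    b(y) -> -inf as y -> a, and b(y) -> +inf as y -> A.
    """
    if not (a < 0 < A):
        raise ValueError("bounds must satisfy a < 0 < A")
    if not (a < y < A):
        raise ValueError("y must lie strictly inside (a, A)")
    return math.log((A * (a - y)) / (a * (A - y)))


def barrier_inverse(s, a, A):
    """Map an unconstrained s back into (a, A); inverse of `barrier`.

    Solving b(y) = s for y gives y = a A (e^s - 1) / (a e^s - A).
    The denominator is always negative because a < 0 < A.
    """
    if not (a < 0 < A):
        raise ValueError("bounds must satisfy a < 0 < A")
    return a * A * math.expm1(s) / (a * math.exp(s) - A)


if __name__ == "__main__":
    a, A = -1.0, 2.0  # hypothetical state bounds
    for s in (-5.0, 0.0, 5.0):
        # Every finite s lands strictly inside (a, A).
        print(s, barrier_inverse(s, a, A))
```

Under this assumed form, a controller designed in the transformed coordinate cannot drive the original state across the bounds as long as the transformed state stays finite, which is the intuition behind using such a transformation for state constraints.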
Taxonomy
Topics: Adaptive Dynamic Programming Control · Reinforcement Learning in Robotics · Adaptive Control of Nonlinear Systems
