Disturbance Observer-based Control Barrier Functions with Residual Model Learning for Safe Reinforcement Learning

Dvij Kalaria; Qin Lin; John M. Dolan

arXiv:2410.06570·cs.RO·October 10, 2024

TL;DR

This paper introduces a safe reinforcement learning framework that combines disturbance observer-based control barrier functions with residual model learning, enabling robust safety guarantees despite model uncertainties and disturbances.

Contribution

The paper proposes a framework that integrates a disturbance observer (DOB) with residual model learning for safe RL, with the aim of improving robustness and safety in real-world applications where the dynamic model is imprecise.

Findings

01 Outperforms state-of-the-art methods on Safety-gym benchmarks

02 Effective in real-world F1/10 racing car experiments

03 Provides strong safety guarantees under model uncertainties

Abstract

Reinforcement learning (RL) agents need to explore their environment to learn optimal behaviors and achieve maximum rewards. However, exploration can be risky when training RL directly on real systems, while simulation-based training introduces the tricky issue of the sim-to-real gap. Recent approaches have leveraged safety filters, such as control barrier functions (CBFs), to penalize unsafe actions during RL training. However, the strong safety guarantees of CBFs rely on a precise dynamic model. In practice, uncertainties always exist, including internal disturbances from the errors of dynamics and external disturbances such as wind. In this work, we propose a new safe RL framework based on disturbance rejection-guarded learning, which allows for an almost model-free RL with an assumed but not necessarily precise nominal dynamic model. We demonstrate our results on the Safety-gym…
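To make the idea in the abstract concrete, the following is a minimal sketch of a CBF safety filter that uses a disturbance estimate. It is not the paper's implementation. It assumes a toy 1D system `x_dot = u + d` with a constant unknown disturbance `d`, a safe set `h(x) = x_max - x >= 0`, and a basic first-order disturbance observer. All names (`cbf_filter`, `simulate`, `gain`, `alpha`, `margin`) are hypothetical and chosen only for illustration.

```python
def cbf_filter(u_rl, x, d_hat, x_max=1.0, alpha=2.0, margin=0.0):
    """Clip a proposed action so the CBF condition holds (toy 1D case).

    Assumed dynamics: x_dot = u + d, barrier h(x) = x_max - x.
    The condition h_dot + alpha * h >= 0, with d replaced by its
    estimate d_hat, gives u <= alpha * (x_max - x) - d_hat.
    `margin` is an optional allowance for the estimation error.
    """
    u_bound = alpha * (x_max - x) - d_hat - margin
    return min(u_rl, u_bound)


def simulate(steps=2000, dt=0.005, d_true=0.5, gain=20.0, u_rl=1.0):
    """Run the filtered system with a simple disturbance observer.

    Observer (an assumption, not taken from the paper):
        d_hat = z + gain * x
        z_dot = -gain * (d_hat + u)
    which yields d_hat_dot = gain * (d_true - d_hat).
    Returns the largest state visited and the last estimate.
    """
    x, z = 0.0, 0.0
    max_x = x
    d_hat = 0.0
    for _ in range(steps):
        d_hat = z + gain * x                 # current disturbance estimate
        u = cbf_filter(u_rl, x, d_hat)       # filter the "RL" action
        z += dt * (-gain * (d_hat + u))      # observer state update (Euler)
        x += dt * (u + d_true)               # true system update (Euler)
        max_x = max(max_x, x)
    return max_x, d_hat


if __name__ == "__main__":
    max_x, d_hat = simulate()
    print(f"max state: {max_x:.4f}, disturbance estimate: {d_hat:.4f}")
```

Here the constant action `u_rl` stands in for an RL policy that keeps pushing toward the boundary. In this toy setting the observer converges to the true disturbance before the state approaches `x_max`, so the filtered state stays inside the safe set; a real system would also need a bound on the estimation error, which is what `margin` gestures at.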

Taxonomy

Topics: Advanced Control Systems Optimization