Learning to Provably Satisfy High Relative Degree Constraints for Black-Box Systems

Jean-Baptiste Bouvier; Kartik Nagpal; Negar Mehr

arXiv:2407.20456·eess.SY·July 31, 2024

Open Access

TL;DR

This paper introduces a reinforcement learning algorithm that guarantees satisfaction of affine state constraints of high relative degree in closed loop with black-box control systems, where the earlier POLICEd RL method could only enforce constraints of relative degree 1.

Contribution

The paper presents a new RL method that enforces affine state constraints of high relative degree, extending POLICEd RL, which is limited to constraints of relative degree 1.

Findings

1. Successfully enforced constraints in an inverted pendulum simulation.

2. Demonstrated constraint satisfaction in a space shuttle landing simulation.

3. Proved that the guarantees hold for deterministic systems regardless of the RL training algorithm used.

Abstract

In this paper, we develop a method for learning a control policy guaranteed to satisfy an affine state constraint of high relative degree in closed loop with a black-box system. Previous reinforcement learning (RL) approaches to satisfy safety constraints either require access to the system model, or assume control affine dynamics, or only discourage violations with reward shaping. Only recently have these issues been addressed with POLICEd RL, which guarantees constraint satisfaction for black-box systems. However, this previous work can only enforce constraints of relative degree 1. To address this gap, we build a novel RL algorithm explicitly designed to enforce an affine state constraint of high relative degree in closed loop with a black-box control system. Our key insight is to make the learned policy be affine around the unsafe set and to use this affine region to dissipate the…
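To make the term "relative degree" concrete, here is a minimal sketch that is not taken from the paper. It assumes a known linear model x' = Ax + Bu, which the paper's black-box setting does not have, and uses the standard fact that an affine output y = Cx has relative degree r when r is the smallest integer with C A^(r-1) B nonzero. The function name `relative_degree` and the double-integrator example are illustrative assumptions.

```python
import numpy as np


def relative_degree(A, B, C, max_order=None):
    """Return the smallest r >= 1 such that C @ A^(r-1) @ B is nonzero.

    Illustrative helper (not from the paper): assumes a known linear model
    x' = A x + B u and an affine constraint output y = C x.
    Returns None if the input never appears up to max_order derivatives.
    """
    A = np.atleast_2d(np.asarray(A, dtype=float))
    B = np.asarray(B, dtype=float).reshape(A.shape[0], -1)
    row = np.asarray(C, dtype=float).reshape(1, -1)
    max_order = A.shape[0] if max_order is None else max_order
    for r in range(1, max_order + 1):
        # row equals C @ A^(r-1); the input shows up in the r-th derivative
        # of y exactly when row @ B is nonzero.
        if not np.allclose(row @ B, 0.0):
            return r
        row = row @ A
    return None


# Double integrator: state x = (position, velocity), input u = acceleration.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])

position_constraint = np.array([1.0, 0.0])  # y = position
velocity_constraint = np.array([0.0, 1.0])  # y = velocity

rd_position = relative_degree(A, B, position_constraint)
rd_velocity = relative_degree(A, B, velocity_constraint)
print(rd_position, rd_velocity)
```

In this toy model a bound on velocity has relative degree 1, since the input acts on it directly, while a bound on position has relative degree 2, since the input only reaches it through the velocity. The second case is the kind of constraint the abstract describes as out of reach for methods limited to relative degree 1.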


Taxonomy

Topics: Scheduling and Optimization Algorithms

Methods: Sparse Evolutionary Training