Infinite-Horizon Reach-Avoid Zero-Sum Games via Deep Reinforcement Learning

Jingqi Li; Donggun Lee; Somayeh Sojoudi; Claire J. Tomlin

arXiv:2203.10142 · eess.SY · September 19, 2024 · 5 cites

Open Access

TL;DR

This paper introduces a deep reinforcement learning approach to compute reach-avoid sets in infinite-horizon zero-sum games, enabling control under worst-case disturbances with theoretical guarantees.

Contribution

It develops a new value function with a contracting Bellman backup and extends Conservative Q-Learning to high-dimensional reach-avoid problems.

Findings

01 The method reliably learns reach-avoid sets with neural networks.

02 The approach can compute viability kernels and backward reachable sets.

03 Empirical results demonstrate effectiveness in complex scenarios.

Abstract

In this paper, we consider the infinite-horizon reach-avoid zero-sum game problem, where the goal is to find a set in the state space, referred to as the reach-avoid set, such that the system starting at a state therein could be controlled to reach a given target set without violating constraints under the worst-case disturbance. We address this problem by designing a new value function with a contracting Bellman backup, where the super-zero level set, i.e., the set of states where the value function is evaluated to be non-negative, recovers the reach-avoid set. Building upon this, we prove that the proposed method can be adapted to compute the viability kernel, or the set of states which could be controlled to satisfy given constraints, and the backward reachable set, or the set of states that could be driven towards a given target set. Finally, we propose to alleviate the curse of…
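
To make the idea of a contracting reach-avoid backup concrete, the following is a minimal tabular sketch, not the paper's implementation. It assumes a discounted backup of the form V(x) = min{ c(x), max{ r(x), γ · max_u min_d V(f(x, u, d)) } }, which is one common way to write such an operator and is not confirmed by the truncated abstract above. The 10-state line world, the margins `r` and `c`, the dynamics `step`, and the discount `GAMMA` are all hypothetical choices made for illustration.

```python
import numpy as np

# Hypothetical toy problem (all values are illustrative assumptions).
GAMMA = 0.9                 # discount factor; makes the backup a contraction
N_STATES = 10               # states 0..9 on a line
CONTROLS = (-1, 0, 1)       # controller moves left, stays, or moves right
DISTURBANCES = (0, 1)       # adversary may push one step to the right

states = np.arange(N_STATES)
# Target margin: positive inside the target set {2, 3}.
r = np.where((states >= 2) & (states <= 3), 1.0, -1.0)
# Constraint margin: positive on the safe states {0, ..., 7}.
c = np.where(states <= 7, 1.0, -1.0)


def step(x, u, d):
    """Assumed dynamics: move by u + d, clipped to the grid."""
    return int(np.clip(x + u + d, 0, N_STATES - 1))


def backup(V):
    """One sweep of the assumed discounted reach-avoid backup."""
    new_V = np.empty_like(V)
    for x in states:
        # Controller maximizes, disturbance minimizes the next value.
        best = max(min(V[step(x, u, d)] for d in DISTURBANCES)
                   for u in CONTROLS)
        new_V[x] = min(c[x], max(r[x], GAMMA * best))
    return new_V


# Value iteration from the instantaneous margin min(r, c).
V = np.minimum(r, c)
for _ in range(200):
    V = backup(V)

# Threshold the value at zero to read off the toy reach-avoid set.
reach_avoid_set = [int(x) for x in states if V[x] > 0]
print(reach_avoid_set)
```

In this toy, states 4 to 7 can stay safe indefinitely but the disturbance can keep them out of the target, so their value stays negative and only approaches zero from below as the sweeps continue. The sketch therefore stops after a finite number of sweeps and uses a strict threshold; how the zero level itself is treated is a modelling choice that this page does not settle.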

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research

Methods: Q-Learning