Infinite-Horizon Reach-Avoid Zero-Sum Games via Deep Reinforcement Learning
Jingqi Li, Donggun Lee, Somayeh Sojoudi, Claire J. Tomlin

TL;DR
This paper introduces a deep reinforcement learning approach to compute reach-avoid sets in infinite-horizon zero-sum games, enabling control under worst-case disturbances with theoretical guarantees.
Contribution
The paper designs a new value function with a contracting Bellman backup and extends Conservative Q-Learning to high-dimensional reach-avoid problems.
Findings
The method reliably learns reach-avoid sets with neural networks.
The approach can compute viability kernels and backward reachable sets.
Empirical results demonstrate effectiveness in complex scenarios.
Abstract
In this paper, we consider the infinite-horizon reach-avoid zero-sum game problem, where the goal is to find a set in the state space, referred to as the reach-avoid set, such that the system starting at a state therein could be controlled to reach a given target set without violating constraints under the worst-case disturbance. We address this problem by designing a new value function with a contracting Bellman backup, where the super-zero level set, i.e., the set of states where the value function is evaluated to be non-negative, recovers the reach-avoid set. Building upon this, we prove that the proposed method can be adapted to compute the viability kernel, or the set of states which could be controlled to satisfy given constraints, and the backward reachable set, or the set of states that could be driven towards a given target set. Finally, we propose to alleviate the curse of…
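To make the idea of a contracting reach-avoid backup concrete, the following is a minimal tabular sketch on a toy 1-D grid. It is not the paper's implementation. The backup form min{c, max{r, γ max_u min_d V}}, the sign convention (r > 0 inside the target, c > 0 where constraints hold), the grid, the dynamics, and all names (`backup`, `solve`, `CONTROLS`, `DISTURBANCES`) are assumptions chosen for illustration, since this summary does not state the paper's exact operator. The disturbance is deliberately stronger than the control toward the right, so some states that currently satisfy the constraint are still outside the reach-avoid set.

```python
import numpy as np

# Hypothetical toy problem (not from the paper): 11 cells on a line.
N = 11
GAMMA = 0.9                    # discount factor; makes the backup a contraction
CONTROLS = (-1, 0, 1)          # controller picks a step
DISTURBANCES = (0, 1, 2)       # adversary picks a push to the right

states = np.arange(N)
# Assumed sign convention: r > 0 inside the target, c > 0 where constraints hold.
r = np.where((states >= 3) & (states <= 5), 1.0, -1.0)   # target = {3, 4, 5}
c = np.where(states == N - 1, -1.0, 1.0)                 # cell 10 is unsafe


def backup(V):
    """One application of the assumed discounted reach-avoid backup."""
    best = np.full(N, -np.inf)
    for u in CONTROLS:
        worst = np.full(N, np.inf)
        for d in DISTURBANCES:
            nxt = np.clip(states + u + d, 0, N - 1)   # walls at both ends
            worst = np.minimum(worst, V[nxt])         # disturbance minimizes
        best = np.maximum(best, worst)                # controller maximizes
    # Constraint violation caps the value; reaching the target floors it.
    return np.minimum(c, np.maximum(r, GAMMA * best))


def solve(tol=1e-12, max_iters=10_000):
    """Fixed-point iteration; converges because backup is a GAMMA-contraction."""
    V = np.zeros(N)
    for _ in range(max_iters):
        V_new = backup(V)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V


V = solve()
# Super-zero level set, following the abstract's non-negative convention.
reach_avoid_set = states[V >= 0].tolist()
print(reach_avoid_set)
```

In this toy setup, cells 6 through 9 satisfy the constraint but receive negative values, because the disturbance can force the state into the unsafe cell before the controller reaches the target. No cell has a value of exactly zero here, so the strict and non-strict level sets coincide. In the high-dimensional setting the abstract refers to, a neural network would stand in for the table `V`.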
Taxonomy
Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research
Methods: Q-Learning
