R2PS: Worst-Case Robust Real-Time Pursuit Strategies under Partial Observability

Runyu Lu; Ruochuan Shi; Yuanheng Zhu; Dongbin Zhao

arXiv:2511.17367 · cs.LG · May 15, 2026

TL;DR

This paper develops a real-time pursuit strategy for pursuit-evasion games under partial observability, combining dynamic programming and reinforcement learning to achieve robustness and generalization to unseen graph structures.

Contribution

The paper introduces the first framework for worst-case robust pursuit strategies under partial observability, extending dynamic programming with belief preservation and integrating it into an RL scheme.
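
To make the idea of tracking a belief over the evader's position concrete, here is a minimal sketch of a set-valued belief update on a graph. It is a generic illustration, not the paper's algorithm: the function name `update_belief`, the adjacency-dict graph, the "stay or move one edge" evader dynamics, and the observation model (pursuers see a given set of nodes) are all assumptions made for this example.

```python
from typing import Dict, Hashable, Optional, Set

Graph = Dict[Hashable, Set[Hashable]]


def update_belief(
    graph: Graph,
    belief: Set[Hashable],
    observed: Set[Hashable],
    sighting: Optional[Hashable] = None,
) -> Set[Hashable]:
    """One step of a set-valued belief over the evader's position.

    graph:    adjacency mapping, node -> set of neighbours (hypothetical format).
    belief:   nodes the evader could occupy before it moves.
    observed: nodes the pursuers can see after the move (assumed observation model).
    sighting: node where the evader was actually seen this step, or None.
    """
    if sighting is not None:
        # A direct sighting collapses the belief to a single node.
        return {sighting}
    # Assumed dynamics: the evader either stays put or moves along one edge.
    reachable = set(belief)
    for node in belief:
        reachable |= graph[node]
    # No sighting: the evader cannot be on any node the pursuers can see.
    return reachable - set(observed)


# Path graph 0 - 1 - 2 - 3 - 4, evader last known at node 2.
graph: Graph = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
belief = update_belief(graph, {2}, observed={0, 1})
print(sorted(belief))  # [2, 3]
```

A worst-case pursuer would then plan against every node left in the belief set rather than against a single estimated position, which is one reason such computations grow expensive as the belief widens.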

Findings

1. Policy outperforms existing approaches on unseen graphs.
2. Achieves robust zero-shot generalization.
3. Effective in real-world graph structures.

Abstract

Computing worst-case robust strategies in pursuit-evasion games (PEGs) is time-consuming, especially when real-world factors like partial observability are considered. While important for general security purposes, real-time applicable pursuit strategies for graph-based PEGs are currently missing when the pursuers only have imperfect information about the evader's position. Although state-of-the-art reinforcement learning (RL) methods like Equilibrium Policy Generalization (EPG) and Grasper provide guidelines for learning graph neural network (GNN) policies robust to different game dynamics, they are restricted to the scenario of perfect information and do not take into account the possible case where the evader can predict the pursuers' actions. This paper introduces the first approach to worst-case robust real-time pursuit strategies (R2PS) under partial observability. We first prove…

Videos

R2PS: Worst-Case Robust Real-Time Pursuit Strategies under Partial Observability (SlidesLive)