Reinforcement learning in pursuit-evasion differential game: safety, stability and robustness
Xinyang Wang, Hongwei Zhang, Jun Xu, Shimin Wang, Martin Guay

TL;DR
This paper develops a hierarchical reinforcement learning framework for pursuit-evasion problems that ensures safety, stability, and robustness against disturbances by integrating control barrier functions, sliding mode control, and game-theoretic strategies.
Contribution
It introduces a novel hierarchical RL approach inspired by Stackelberg games to handle safety and robustness in pursuit-evasion scenarios with disturbances.
Findings
The framework maintains safety and stability in simulations.
The proposed RL method is robust to disturbances.
The hierarchical design improves safety enforcement without sacrificing stability.
Abstract
Safety and stability are two critical concerns in pursuit-evasion (PE) problems in obstacle-rich environments. Most existing works combine control barrier functions (CBFs) and reinforcement learning (RL) to provide an efficient and safe solution. However, they do not consider disturbances, such as wind gusts and actuator faults, which arise in many practical applications. This paper integrates CBFs and a sliding mode control (SMC) term into RL to simultaneously address safety, stability, and robustness to disturbances. This integration is challenging because of the strong coupling between the CBF and SMC terms. Inspired by the Stackelberg game, we handle the coupling issue by proposing a hierarchical design scheme in which the SMC and safe control terms interact with each other in a leader-follower manner. Specifically, the CBF controller, acting as the…
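The two ingredients named in the abstract can be illustrated separately with a toy example. The sketch below is not the paper's hierarchical scheme and contains no RL component. It assumes a single-integrator pursuer (x_dot = u + d, with the disturbance bounded by d_max), one circular obstacle, and a fixed target. All names, gains, and values (cbf_filter, smc_term, simulate, alpha, eps) are hypothetical choices made for illustration. The CBF filter is the closed-form solution of a one-constraint quadratic program, and the SMC term is a saturated switching law added to a proportional pursuit command.

```python
import math


def cbf_filter(x, u_nom, center, radius, alpha=1.0, d_max=0.0):
    """Minimally modify u_nom so h(x) = ||x - c||^2 - r^2 stays nonnegative.

    Assumption: dynamics x_dot = u + d with ||d|| <= d_max.
    Closed-form solution of  min ||u - u_nom||^2  s.t.  a.u + b >= 0.
    """
    a = [2.0 * (xi - ci) for xi, ci in zip(x, center)]  # gradient of h
    h = sum((xi - ci) ** 2 for xi, ci in zip(x, center)) - radius ** 2
    a_norm = math.sqrt(sum(ai * ai for ai in a))
    b = alpha * h - a_norm * d_max  # margin for the worst-case disturbance
    slack = sum(ai * ui for ai, ui in zip(a, u_nom)) + b
    if slack >= 0.0 or a_norm == 0.0:
        return list(u_nom)  # nominal command already satisfies the constraint
    scale = -slack / (a_norm ** 2)
    return [ui + scale * ai for ui, ai in zip(u_nom, a)]


def smc_term(s, gain, eps=0.05):
    """Smoothed switching term -gain * sat(s / eps), applied per component."""
    return [-gain * max(-1.0, min(1.0, si / eps)) for si in s]


def simulate(steps=2000, dt=0.01):
    """Euler simulation of the toy pursuer; returns final state and min h."""
    x = [-2.0, 0.1]
    target = [2.0, 0.0]
    center, radius, d_max = [0.0, 0.0], 0.5, 0.2
    min_h = float("inf")
    for k in range(steps):
        t = k * dt
        e = [xi - ti for xi, ti in zip(x, target)]  # tracking error
        robust = smc_term(e, gain=0.3)
        u_nom = [-1.0 * ei + ri for ei, ri in zip(e, robust)]
        u = cbf_filter(x, u_nom, center, radius, alpha=1.0, d_max=d_max)
        d = [d_max * math.sin(t), d_max * math.cos(t)]  # bounded disturbance
        x = [xi + dt * (ui + di) for xi, ui, di in zip(x, u, d)]
        h = sum((xi - ci) ** 2 for xi, ci in zip(x, center)) - radius ** 2
        min_h = min(min_h, h)
    return x, min_h


if __name__ == "__main__":
    x_final, min_h = simulate()
    print("final state:", x_final)
    print("minimum barrier value:", min_h)
```

In this toy setting the barrier value stays nonnegative along the simulated trajectory because the filter budgets for the worst-case disturbance. Note that the filter here simply overrides the nominal command; the coupling between the CBF and SMC terms that the abstract identifies as the main difficulty is not addressed by this sketch.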
Taxonomy
Topics: Guidance and Control Systems · Adaptive Dynamic Programming Control · Extremum Seeking Control Systems
