Solving Reach-Avoid-Stay Problems Using Deep Deterministic Policy Gradients
Gabriel Chenevert, Jingqi Li, Achyuta Kannan, Sangjae Bae, Donggun Lee

TL;DR
This paper introduces a deep reinforcement learning approach using DDPG to solve reach-avoid-stay problems, enabling systems to reach targets, avoid obstacles, and stay near the target in complex, high-dimensional environments.
Contribution
It extends RL-based reachability analysis to RAS problems by proposing a two-step DDPG method that finds the maximal robust RAS set, handling complex environments and high-dimensional systems.
Findings
Achieves higher success rates than previous methods.
Enables RAS in complex, high-dimensional environments.
Validated through simulation and high-dimensional experiments.
Abstract
Reach-Avoid-Stay (RAS) optimal control enables systems such as robots and air taxis to reach their targets, avoid obstacles, and stay near the target. However, current methods for RAS often struggle to handle complex, dynamic environments and to scale to high-dimensional systems. While reinforcement learning (RL)-based reachability analysis addresses these challenges, it has yet to tackle the RAS problem. In this paper, we propose a two-step deep deterministic policy gradient (DDPG) method that extends the RL-based reachability method to solve RAS problems. First, we train a function that characterizes the maximal robust control invariant set within the target set, where the system can safely stay, along with its corresponding policy. Second, we train a function that defines the set of states capable of safely reaching the robust control invariant set, along with its corresponding policy.…
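The two-step structure described in the abstract can be illustrated on a toy problem. The sketch below is not the paper's DDPG method: it swaps in exhaustive set iteration on a small 1D integer grid, purely to show how step 1 (stay) feeds step 2 (reach-avoid). The dynamics x' = x + u + d, the control and disturbance sets, the target, the obstacle, and all function names are assumptions made for this illustration.

```python
# Toy illustration only: grid-based set iteration, NOT the paper's DDPG method.
# Assumed (hypothetical) dynamics: x_next = x + u + d on an integer line.

def robust_successors(x, u, disturbances):
    """All states reachable from x under control u, over every disturbance."""
    return {x + u + d for d in disturbances}

def maximal_robust_invariant_set(target, controls, disturbances):
    """Step 1 (stay): largest subset of `target` in which some control keeps
    every disturbed successor inside the subset."""
    s = set(target)
    while True:
        keep = {x for x in s
                if any(robust_successors(x, u, disturbances) <= s
                       for u in controls)}
        if keep == s:
            return s
        s = keep

def robust_reach_avoid_set(goal, obstacles, states, controls, disturbances):
    """Step 2 (reach-avoid): states that can be steered into `goal` for every
    disturbance without ever entering an obstacle or leaving the grid."""
    safe = set(states) - set(obstacles)
    reach = set(goal) & safe
    while True:
        grow = {x for x in safe - reach
                if any(robust_successors(x, u, disturbances) <= reach
                       for u in controls)}
        if not grow:
            return reach
        reach |= grow

# Hypothetical problem data.
states = range(-6, 12)
target = {0, 1, 2, 3, 6, 7}      # target set; {6, 7} is too narrow to stay in
obstacles = {-3}
controls = (-2, -1, 0, 1, 2)
disturbances = (-1, 0, 1)

stay_set = maximal_robust_invariant_set(target, controls, disturbances)
ras_set = robust_reach_avoid_set(stay_set, obstacles, states,
                                 controls, disturbances)
print(sorted(stay_set))  # [0, 1, 2, 3]
print(sorted(ras_set))   # [-2, -1, 0, ..., 11]
```

In this toy setting the states 6 and 7 belong to the target but not to the invariant set, so step 2 treats them as states that must still travel to {0, 1, 2, 3}. States left of the obstacle at -3 are excluded because no control can guarantee passing it under every disturbance.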
Taxonomy
Topics: Optimization and Search Problems · Robotic Path Planning Algorithms
Methods: Sparse Evolutionary Training
