Solving Stabilize-Avoid Optimal Control via Epigraph Form and Deep Reinforcement Learning
Oswin So, Chuchu Fan

TL;DR
This paper introduces a novel deep reinforcement learning approach for solving complex stabilize-avoid optimal control problems in high-dimensional nonlinear systems, improving stability and safety over traditional methods.
Contribution
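The paper recasts the stabilize-avoid problem as an infinite-horizon constrained optimal control problem, transforms it into epigraph form, and solves it by combining on-policy deep RL with neural network regression, enabling scalable and stable solutions for high-dimensional systems.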
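To make the epigraph idea concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm: the scalar decision variable `x` stands in for the policy, a grid search stands in for the deep RL inner solver, and bisection stands in for the outer update of the auxiliary variable. The functions `cost` and `constraint` and all numeric values are assumptions chosen only for illustration.

```python
import numpy as np

def cost(x):
    # Hypothetical cost to minimize (stand-in for the control objective).
    return (x - 2.0) ** 2

def constraint(x):
    # Hypothetical safety constraint, satisfied when constraint(x) <= 0.
    return x - 1.0

def inner_value(z, xs):
    # Inner problem: minimize over x the worse of "cost exceeds budget z"
    # and "constraint violated". A value <= 0 means budget z is achievable safely.
    return np.min(np.maximum(cost(xs) - z, constraint(xs)))

def solve_epigraph(xs, z_lo, z_hi, tol=1e-6):
    # Outer problem: find the smallest z whose inner value is <= 0.
    # Assumes inner_value(z_lo) > 0 and inner_value(z_hi) <= 0.
    while z_hi - z_lo > tol:
        z_mid = 0.5 * (z_lo + z_hi)
        if inner_value(z_mid, xs) <= 0.0:
            z_hi = z_mid
        else:
            z_lo = z_mid
    return z_hi

xs = np.linspace(-3.0, 3.0, 6001)
z_star = solve_epigraph(xs, z_lo=0.0, z_hi=25.0)

# Direct constrained minimum on the same grid, for comparison.
feasible = xs[constraint(xs) <= 0.0]
direct = cost(feasible).min()
print(z_star, direct)
```

The inner value is non-increasing in `z`, which is what allows a one-dimensional search in the outer problem. On this toy problem the outer search recovers the same optimal cost as the direct constrained minimization over the grid.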
Findings
Achieves better stability during training.
Provides larger regions of attraction.
Outperforms existing methods in safety and stability.
Abstract
Tasks for autonomous robotic systems commonly require stabilization to a desired region while maintaining safety specifications. However, solving this multi-objective problem is challenging when the dynamics are nonlinear and high-dimensional, as traditional methods do not scale well and are often limited to specific problem structures. To address this issue, we propose a novel approach to solve the stabilize-avoid problem via the solution of an infinite-horizon constrained optimal control problem (OCP). We transform the constrained OCP into epigraph form and obtain a two-stage optimization problem that optimizes over the policy in the inner problem and over an auxiliary variable in the outer problem. We then propose a new method for this formulation that combines an on-policy deep reinforcement learning algorithm with neural network regression. Our method yields better stability during…
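As a rough illustration of the transformation the abstract describes, the standard epigraph reformulation of a constrained problem can be written as below. The symbols are assumptions for this sketch, not necessarily the paper's notation: $\pi$ is the policy, $J(\pi)$ the cost, $h(\pi) \le 0$ the safety constraint, and $z$ the auxiliary variable.

```latex
\begin{aligned}
\text{Constrained OCP:} \quad & \min_{\pi} \; J(\pi) \quad \text{s.t.} \quad h(\pi) \le 0 \\
\text{Add cost bound } z\text{:} \quad & \min_{\pi,\, z} \; z \quad \text{s.t.} \quad J(\pi) \le z, \;\; h(\pi) \le 0 \\
\text{Two-stage epigraph form:} \quad & \min_{z} \; z \quad \text{s.t.} \quad \min_{\pi} \, \max\bigl\{ J(\pi) - z, \; h(\pi) \bigr\} \le 0
\end{aligned}
```

The last line separates the problem into an inner minimization over the policy for a fixed $z$ and an outer one-dimensional search over $z$, matching the two-stage structure described in the abstract.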
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Control Systems Optimization · Reinforcement Learning in Robotics
