Learning Robust Policies via Interpretable Hamilton-Jacobi Reachability-Guided Disturbances
Hanyang Hu, Xilun Zhang, Xubo Lyu, Mo Chen

TL;DR
This paper introduces a novel robust policy training method combining model-based control with adversarial RL, using Hamilton-Jacobi reachability to generate interpretable worst-case disturbances, improving robustness in robotics tasks.
Contribution
It presents a new Hamilton-Jacobi reachability-guided disturbance approach for adversarial RL, enhancing robustness without external black-box adversaries.
Findings
Effective in a reach-avoid game, in both simulation and real-world settings
Achieves robust stabilization in a highly dynamic quadrotor task in simulation
Critic network aligns with ground-truth Hamilton-Jacobi value function
Abstract
Deep Reinforcement Learning (RL) has shown remarkable success in robotics with complex and heterogeneous dynamics. However, its vulnerability to unknown disturbances and adversarial attacks remains a significant challenge. In this paper, we propose a robust policy training framework that integrates model-based control principles with adversarial RL training to improve robustness without the need for external black-box adversaries. Our approach introduces a novel Hamilton-Jacobi reachability-guided disturbance for adversarial RL training, where we use interpretable worst-case or near-worst-case disturbances as adversaries against the robust policy. We evaluated its effectiveness across three distinct tasks: a reach-avoid game in both simulation and real-world settings, and a highly dynamic quadrotor stabilization task in simulation. We validate that our learned critic network is…
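To make the idea of a reachability-guided disturbance concrete, here is a minimal sketch of how a worst-case disturbance can be read off the gradient of a value function. It is not the authors' implementation: the double-integrator dynamics, the stand-in `value_fn`, the bound `d_max`, and all function names are assumptions chosen for illustration. A real Hamilton-Jacobi value function would typically come from a numerical solver rather than a closed-form expression.

```python
import numpy as np

def value_fn(x):
    # Hypothetical stand-in for an HJ value function: larger means safer.
    # It is NOT a true HJ solution, only a smooth function to differentiate.
    p, v = x
    return p**2 + v**2 - 1.0

def grad(f, x, eps=1e-5):
    # Central finite-difference gradient of f at x.
    x = np.asarray(x, dtype=float)
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

def worst_case_disturbance(x, d_max=0.5):
    # Assumed dynamics: p_dot = v, v_dot = u + d, with |d| <= d_max,
    # so the disturbance enters through g_d = [0, 1].
    g_d = np.array([0.0, 1.0])
    dV = grad(value_fn, x)
    # The adversary minimizes (dV . g_d) * d over the interval [-d_max, d_max],
    # which is attained at the bound opposite in sign to dV . g_d.
    return -d_max * np.sign(dV @ g_d)

d = worst_case_disturbance([0.0, 2.0])
```

Under these assumptions the disturbance pushes the state in the direction that decreases the value fastest, which is what makes it interpretable: its direction is explained by the value gradient instead of by a learned black-box adversary.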
Taxonomy
Topics: Fault Detection and Control Systems · Reinforcement Learning in Robotics
