Robust Deep Reinforcement Learning with Adaptive Adversarial Perturbations in Action Space
Qianmei Liu, Yufei Kuang, Jie Wang

TL;DR
This paper introduces an adaptive adversarial perturbation method for deep reinforcement learning that dynamically adjusts perturbation levels during training, enhancing robustness and stability without needing prior simulator access.
Contribution
The paper proposes a novel adaptive adversarial perturbation framework (A2P) that automatically tunes perturbation strength during training to improve robustness and stability in DRL.
Findings
A2P improves training stability in MuJoCo environments.
A2P enhances policy robustness across different test environments.
The method does not require prior simulator access.
Abstract
Deep reinforcement learning (DRL) algorithms can suffer from modeling errors between the simulation and the real world. Many studies use adversarial learning to generate perturbations during training to model this discrepancy and improve the robustness of DRL. However, most of these approaches use a fixed parameter to control the intensity of the adversarial perturbation, which can lead to a trade-off between average performance and robustness. In fact, finding the optimal value of this parameter is challenging, as excessive perturbations may destabilize training and compromise agent performance, while insufficient perturbations may not impart enough information to enhance robustness. To keep the training stable while improving robustness, we propose a simple but effective method, namely, Adaptive Adversarial Perturbation (A2P), which can dynamically select appropriate…
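To make the trade-off described in the abstract concrete, here is a minimal sketch of an action-space perturbation whose strength is adjusted during training. It is an illustration under stated assumptions, not the A2P algorithm: the abstract is truncated and does not give the paper's selection rule. The linear mixing form, the names `mix_actions` and `adapt_alpha`, and the parameters `step` and `alpha_max` are all hypothetical.

```python
from typing import List, Sequence


def mix_actions(agent_action: Sequence[float],
                adversary_action: Sequence[float],
                alpha: float) -> List[float]:
    """Blend the agent's action with an adversarial action.

    alpha in [0, 1] is the perturbation strength: 0 leaves the agent's
    action untouched, 1 hands full control to the adversary.
    (Assumed mixing form; the paper's exact perturbation may differ.)
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [(1.0 - alpha) * a + alpha * b
            for a, b in zip(agent_action, adversary_action)]


def adapt_alpha(alpha: float,
                recent_return: float,
                baseline_return: float,
                step: float = 0.01,
                alpha_max: float = 0.5) -> float:
    """Hypothetical adaptation rule, NOT the rule from the paper.

    Raise alpha while the agent still matches its baseline return
    (training looks stable), lower it when the return falls below the
    baseline (perturbation looks too strong), then clip to [0, alpha_max].
    """
    if recent_return >= baseline_return:
        alpha += step
    else:
        alpha -= step
    return min(max(alpha, 0.0), alpha_max)
```

With a fixed `alpha`, the first function alone corresponds to the fixed-intensity setting the abstract criticizes; the second function stands in for whatever mechanism chooses the intensity adaptively.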
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
