Adversary Agnostic Robust Deep Reinforcement Learning
Xinghua Qu, Yew-Soon Ong, Abhishek Gupta, Zhu Sun

TL;DR
This paper introduces an adversary agnostic approach to enhance the robustness of deep reinforcement learning policies without relying on adversarial training, using a novel policy distillation method grounded in theoretical guarantees.
Contribution
The paper proposes a new adversary agnostic robust DRL paradigm based on policy distillation, with a novel loss function and theoretical analysis ensuring increased robustness without adversary knowledge.
Findings
Outperforms state-of-the-art methods in Atari games
Theoretically guarantees increased robustness
Effective against various perturbations
Abstract
Deep reinforcement learning (DRL) policies have been shown to be deceived by perturbations (e.g., random noise or intentional adversarial attacks) on state observations that appear at test time but are unknown during training. To increase the robustness of DRL policies, previous approaches assume that knowledge of the adversaries can be added to the training process to achieve the corresponding generalization ability on these perturbed observations. However, such an assumption not only makes the robustness improvement more expensive but may also leave a model less effective against other kinds of attacks in the wild. In contrast, we propose an adversary agnostic robust DRL paradigm that does not require learning from adversaries. To this end, we first theoretically derive that robustness could indeed be achieved independently of the adversaries based on a policy distillation setting.…
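For readers unfamiliar with the policy distillation setting the abstract refers to, the following is a minimal sketch of the conventional distillation objective: a KL divergence between a teacher's softened action distribution and a student's action distribution. It is not the paper's novel loss, which this summary does not specify. The function names, the `temperature` parameter, and the example values are illustrative assumptions.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of action scores."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_kl(teacher_q, student_logits, temperature=1.0):
    """KL(teacher || student) for a single state.

    teacher_q:      teacher action scores (e.g. Q-values), hypothetical input
    student_logits: student action scores for the same state
    temperature:    softens (>1) or sharpens (<1) the teacher distribution
    """
    p = softmax(teacher_q, temperature)
    q = softmax(student_logits)
    # Clamp q away from zero so the log stays finite if a probability underflows.
    return sum(
        pi * math.log(pi / max(qi, 1e-12))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )

# Illustrative values only: a student that disagrees with the teacher
# about the best action incurs a positive loss.
loss = distillation_kl([2.0, 0.0, 0.0], [0.0, 2.0, 0.0])
```

In this generic form the loss is zero when the student reproduces the teacher's distribution and grows as the two diverge. Note that nothing in it depends on an adversary or on perturbed observations.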
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Advanced Malware Detection Techniques
