Robust Reinforcement Learning on State Observations with Learned Optimal Adversary

Huan Zhang; Hongge Chen; Duane Boning; Cho-Jui Hsieh

arXiv:2101.08452 · cs.LG · January 22, 2021 · 46 cites

TL;DR

This paper introduces ATLA, a framework that trains reinforcement learning agents to be robust to adversarial perturbations of their state observations by training a learned adversary alongside the agent and by using history-based policies. It reports improved robustness on continuous control tasks under strong attacks.

Contribution

The paper proposes a novel adversarial attack based on a learned adversary, and an alternating training framework (ATLA) that uses such adversaries, together with history-based policies, to make RL agents more robust to adversarial observation perturbations.

Findings

1. ATLA outperforms previous methods under strong adversaries.

2. Learned adversaries can generate more effective attacks than prior approaches.

3. History-aware policies, such as LSTM-based policies, improve robustness against adversarial attacks.

Abstract

We study the robustness of reinforcement learning (RL) with adversarially perturbed state observations, which aligns with the setting of many adversarial attacks on deep reinforcement learning (DRL) and is also important for rolling out real-world RL agents under unpredictable sensing noise. With a fixed agent policy, we demonstrate that an optimal adversary to perturb state observations can be found, which is guaranteed to obtain the worst-case agent reward. For DRL settings, this leads to a novel empirical adversarial attack on RL agents via a learned adversary that is much stronger than previous ones. To enhance the robustness of an agent, we propose a framework of alternating training with learned adversaries (ATLA), which trains an adversary online together with the agent using policy gradient following the optimal adversarial attack framework. Additionally, inspired by the analysis…
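The claim that an optimal observation adversary exists for a fixed agent policy can be illustrated on a toy problem. The sketch below is not the paper's implementation: it assumes a hypothetical five-state chain environment, a hand-written deterministic agent, and an adversary that may report any state within distance `eps` of the true one, and it uses tabular value iteration in place of the policy-gradient adversary the abstract describes. All names (`agent_policy`, `step`, `worst_case_values`, and the constants) are made up for illustration. With the agent fixed, the adversary faces an ordinary decision problem whose "actions" are the observations it shows the agent, so minimizing the agent's return can be solved by dynamic programming.

```python
import numpy as np

# Hypothetical toy setup (not from the paper): a 5-state chain with a goal in the middle.
N_STATES, GOAL, GAMMA = 5, 2, 0.9


def agent_policy(obs):
    """Fixed agent: move one step toward GOAL based on the OBSERVED state."""
    return int(np.sign(GOAL - obs))  # -1 = left, 0 = stay, +1 = right


def step(state, action):
    """Deterministic dynamics driven by the TRUE state; reward 1 for landing on GOAL."""
    next_state = int(np.clip(state + action, 0, N_STATES - 1))
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def allowed_observations(state, eps):
    """Observations the adversary may report: states within distance eps."""
    return range(max(0, state - eps), min(N_STATES - 1, state + eps) + 1)


def worst_case_values(eps, iters=500):
    """Value iteration for the adversary: pick the observation that minimizes
    the fixed agent's discounted return from each true state."""
    values = np.zeros(N_STATES)
    best_obs = np.arange(N_STATES)
    for _ in range(iters):
        new_values = np.empty(N_STATES)
        for s in range(N_STATES):
            candidates = []
            for obs in allowed_observations(s, eps):
                s_next, r = step(s, agent_policy(obs))
                candidates.append((r + GAMMA * values[s_next], obs))
            new_values[s], best_obs[s] = min(candidates)
        values = new_values
    return values, best_obs


clean_values, _ = worst_case_values(eps=0)       # no perturbation allowed
attacked_values, attack = worst_case_values(eps=1)
print("clean:", clean_values)
print("attacked:", attacked_values, "adversary shows:", attack)
```

In this toy, setting `eps=0` recovers the agent's ordinary value function, while `eps=1` lets the adversary show the goal state whenever the agent is one step away from it, so the agent stops short and never collects reward. ATLA, as summarized above, alternates between training such an adversary and training the agent against it; that outer loop is not shown here.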

Code & Models

Videos

Robust Reinforcement Learning on State Observations with Learned Optimal Adversary · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Anomaly Detection Techniques and Applications

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory