Towards Optimal Adversarial Robust Q-learning with Bellman Infinity-error

Haoran Li; Zicheng Zhang; Wang Luo; Congying Han; Yudong Hu; Tiande Guo; Shichen Liao

arXiv:2402.02165·cs.LG·June 24, 2024·1 cites

TL;DR

This paper investigates whether an optimal robust policy exists in deep reinforcement learning under adversarial conditions. It proves that such a policy can be deterministic and introduces a training method that minimizes the Bellman Infinity-error to improve robustness.

Contribution

It establishes the existence of a deterministic, stationary optimal robust policy under a consistency assumption and proposes CAR-DQN, a new algorithm trained to minimize Bellman Infinity-error for enhanced adversarial robustness.

Findings

01. CAR-DQN outperforms existing methods on benchmarks.

02. Minimizing Bellman Infinity-error is crucial for robustness.

03. Theoretical proof of the existence of a deterministic ORP.

Abstract

Establishing robust policies is essential to counter attacks or disturbances affecting deep reinforcement learning (DRL) agents. Recent studies explore state-adversarial robustness and suggest the potential lack of an optimal robust policy (ORP), posing challenges in setting strict robustness constraints. This work further investigates ORP: First, we introduce a consistency assumption of policy (CAP) stating that optimal actions in the Markov decision process remain consistent under minor perturbations, supported by empirical and theoretical evidence. Building upon CAP, we crucially prove the existence of a deterministic and stationary ORP that aligns with the Bellman optimal policy. Furthermore, we illustrate the necessity of the $L^{\infty}$-norm when minimizing Bellman error to attain ORP. This finding clarifies the vulnerability of prior DRL algorithms that target the Bellman optimal…
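The abstract's point about the $L^{\infty}$-norm can be illustrated on a toy problem. The sketch below is not the paper's CAR-DQN and uses no data from the paper. It assumes a small random tabular MDP, and the function names are hypothetical. It compares an averaged Bellman error with the maximum (infinity-norm) Bellman error for a Q-table that is correct everywhere except one state-action pair.

```python
import numpy as np

def bellman_optimality_backup(Q, P, R, gamma):
    """Apply the Bellman optimality operator to a tabular Q of shape (S, A).

    P has shape (S, A, S) (transition probabilities), R has shape (S, A).
    """
    V = Q.max(axis=1)              # greedy value of each next state
    return R + gamma * (P @ V)     # expected one-step backup, shape (S, A)

def bellman_error_norms(Q, P, R, gamma):
    """Return mean-absolute, root-mean-square, and max Bellman residuals."""
    residual = np.abs(bellman_optimality_backup(Q, P, R, gamma) - Q)
    return {
        "l1_mean": float(residual.mean()),
        "l2_rms": float(np.sqrt((residual ** 2).mean())),
        "linf": float(residual.max()),
    }

def value_iteration(P, R, gamma, iters=2000):
    """Approximate the Bellman optimal Q by repeated backups."""
    Q = np.zeros_like(R)
    for _ in range(iters):
        Q = bellman_optimality_backup(Q, P, R, gamma)
    return Q

# Toy MDP (assumption: random dense transitions, rewards in [0, 1)).
rng = np.random.default_rng(0)
S, A, gamma = 50, 3, 0.9
P = rng.random((S, A, S))
P /= P.sum(axis=2, keepdims=True)
R = rng.random((S, A))

Q_star = value_iteration(P, R, gamma)

# Corrupt a single entry: small on average, large in the worst case.
Q_bad = Q_star.copy()
Q_bad[0, 0] += 5.0

norms_star = bellman_error_norms(Q_star, P, R, gamma)
norms_bad = bellman_error_norms(Q_bad, P, R, gamma)
print("optimal Q:  ", norms_star)
print("corrupted Q:", norms_bad)
```

In this toy setting the averaged residual of the corrupted table stays small while its maximum residual is large, which is the kind of localized error that an averaged objective can tolerate and an infinity-norm objective cannot.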

Taxonomy

Topics: Adversarial Robustness in Machine Learning