Towards Optimal Adversarial Robust Q-learning with Bellman Infinity-error
Haoran Li, Zicheng Zhang, Wang Luo, Congying Han, Yudong Hu, Tiande Guo, Shichen Liao

TL;DR
This paper investigates the existence of an optimal robust policy in deep reinforcement learning under adversarial conditions, proving its deterministic nature and introducing a new training method that minimizes Bellman Infinity-error for improved robustness.
Contribution
It establishes the existence of a deterministic, stationary optimal robust policy under a consistency assumption and proposes CAR-DQN, a new algorithm trained to minimize Bellman Infinity-error for enhanced adversarial robustness.
Findings
CAR-DQN outperforms existing methods on benchmarks.
Minimizing Bellman Infinity-error is crucial for robustness.
Theoretical proof that a deterministic optimal robust policy (ORP) exists.
Abstract
Establishing robust policies is essential to counter attacks or disturbances affecting deep reinforcement learning (DRL) agents. Recent studies explore state-adversarial robustness and suggest the potential lack of an optimal robust policy (ORP), posing challenges in setting strict robustness constraints. This work further investigates ORP: First, we introduce a consistency assumption of policy (CAP) stating that optimal actions in the Markov decision process remain consistent with minor perturbations, supported by empirical and theoretical evidence. Building upon CAP, we crucially prove the existence of a deterministic and stationary ORP that aligns with the Bellman optimal policy. Furthermore, we illustrate the necessity of the L∞-norm when minimizing Bellman error to attain ORP. This finding clarifies the vulnerability of prior DRL algorithms that target the Bellman optimal…
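To make the role of the norm concrete, the following is a minimal sketch on a made-up two-state, two-action MDP. It is not the paper's CAR-DQN objective; the rewards, transitions, discount factor, and every function name are hypothetical and chosen only for illustration. It corrupts one entry of the optimal Q-table and compares the Bellman residual under an averaged L1 norm, a root-mean-square L2 norm, and the L∞ (maximum) norm.

```python
# Toy illustration only: the MDP below is hypothetical, not taken from the paper.
GAMMA = 0.9
REWARD = [[1.0, 0.0], [0.0, 2.0]]   # REWARD[s][a], made-up values
NEXT_STATE = [[0, 1], [0, 1]]       # deterministic next state for (s, a)


def bellman_target(q, s, a):
    """One-step Bellman optimality backup for the pair (s, a)."""
    return REWARD[s][a] + GAMMA * max(q[NEXT_STATE[s][a]])


def bellman_residuals(q):
    """Absolute Bellman residual |T q - q| for every (s, a), flattened."""
    return [abs(bellman_target(q, s, a) - q[s][a])
            for s in range(len(q)) for a in range(len(q[s]))]


def bellman_errors(q):
    """Summarize the residuals under three different norms."""
    r = bellman_residuals(q)
    return {
        "l1_mean": sum(r) / len(r),
        "l2_rms": (sum(x * x for x in r) / len(r)) ** 0.5,
        "linf": max(r),
    }


def optimal_q(iterations=500):
    """Plain value iteration on the Q-table."""
    q = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(iterations):
        q = [[bellman_target(q, s, a) for a in range(2)] for s in range(2)]
    return q


q_star = optimal_q()

# Corrupt a single entry so that the greedy action in state 0 flips.
q_bad = [row[:] for row in q_star]
q_bad[0][0] += 5.0

errors = bellman_errors(q_bad)
greedy_star = [row.index(max(row)) for row in q_star]
greedy_bad = [row.index(max(row)) for row in q_bad]

print(errors)
print("greedy actions:", greedy_star, "->", greedy_bad)
```

In this toy case the averaged errors are noticeably smaller than the maximum residual, even though the corrupted table already picks a different greedy action in state 0. Only the L∞ residual bounds the worst-case deviation from the optimal Q-function through the standard contraction argument, ‖Q − Q*‖∞ ≤ ‖TQ − Q‖∞ / (1 − γ), which is one way to read the abstract's emphasis on the L∞-norm.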
Taxonomy
Topics: Adversarial Robustness in Machine Learning
