Robust Adversarial Reinforcement Learning via Bounded Rationality Curricula
Aryaman Reddi, Maximilian Tölle, Jan Peters, Georgia Chalvatzaki, Carlo D'Eramo

TL;DR
This paper introduces Quantal Adversarial RL (QARL), a method that uses entropy regularization and bounded rationality to improve robustness and training efficiency in adversarial reinforcement learning tasks.
Contribution
It proposes a novel entropy-regularized approach that models agents with bounded rationality via Quantal Response Equilibrium, enabling curriculum learning of adversary rationality.
Findings
QARL outperforms RARL and baselines in MuJoCo tasks.
Modulating rationality via temperature improves training stability.
QARL enhances robustness against adversarial attacks.
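The role of temperature in the findings above can be illustrated with a logit quantal response, the standard softmax form of bounded rationality. This is a minimal sketch, not the paper's implementation: the function name `quantal_response`, the three adversary payoffs, and the geometric temperature schedule are all hypothetical choices made for illustration. A high temperature gives a nearly uniform (barely rational) adversary, and a low temperature gives a nearly greedy (almost fully rational) one.

```python
import numpy as np


def quantal_response(expected_payoffs, temperature):
    """Logit quantal response: softmax of payoffs scaled by 1/temperature."""
    z = np.asarray(expected_payoffs, dtype=float) / temperature
    z -= z.max()  # subtract the max for numerical stability
    p = np.exp(z)
    return p / p.sum()


# Hypothetical expected payoffs for three adversary actions (illustrative only).
adversary_payoffs = np.array([1.0, 0.5, 0.0])

# Illustrative curriculum (an assumption, not the paper's schedule):
# decay the temperature geometrically, from nearly random to nearly rational.
temperatures = np.geomspace(10.0, 0.01, num=5)

for t in temperatures:
    probs = quantal_response(adversary_payoffs, t)
    print(f"temperature={t:7.3f}  policy={np.round(probs, 3)}")
```

Early in such a schedule the protagonist would face a weak, noisy adversary. As the temperature shrinks, the adversary's policy concentrates on its best response, which is the limit in which a quantal response recovers the fully rational strategy.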
Abstract
Robustness against adversarial attacks and distribution shifts is a long-standing goal of Reinforcement Learning (RL). To this end, Robust Adversarial Reinforcement Learning (RARL) trains a protagonist against destabilizing forces exercised by an adversary in a competitive zero-sum Markov game, whose optimal solution, i.e., rational strategy, corresponds to a Nash equilibrium. However, finding Nash equilibria requires facing complex saddle point optimization problems, which can be prohibitive to solve, especially for high-dimensional control. In this paper, we propose a novel approach for adversarial RL based on entropy regularization to ease the complexity of the saddle point optimization problem. We show that the solution of this entropy-regularized problem corresponds to a Quantal Response Equilibrium (QRE), a generalization of Nash equilibria that accounts for bounded rationality,…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Animal Disease Management and Epidemiology
Methods: Entropy Regularization
