Feasible Adversarial Robust Reinforcement Learning for Underspecified Environments
JB Lanier, Stephen McAleer, Pierre Baldi, Roy Fox

TL;DR
This paper introduces Feasible Adversarial Robust RL (FARR), a robust reinforcement learning formulation that automatically determines the set of feasible environment parameter values to be robust over, so that the agent is neither overly cautious nor left vulnerable to reasonable parameter values.
Contribution
FARR formulates robustness to environment parameters as a two-player zero-sum game and implicitly defines the feasible parameter set as those values on which an agent could achieve a benchmark reward given enough training resources.
Findings
FARR outperforms existing methods in robustness across tested environments.
Approximate Nash equilibria can be efficiently computed with a variation of PSRO.
Optimal FARR-trained agents show increased resilience to adversarial environment parameters.
Abstract
Robust reinforcement learning (RL) considers the problem of learning policies that perform well in the worst case among a set of possible environment parameter values. In real-world environments, choosing the set of possible values for robust RL can be a difficult task. When that set is specified too narrowly, the agent will be left vulnerable to reasonable parameter values unaccounted for. When specified too broadly, the agent will be too cautious. In this paper, we propose Feasible Adversarial Robust RL (FARR), a novel problem formulation and objective for automatically determining the set of environment parameter values over which to be robust. FARR implicitly defines the set of feasible parameter values as those on which an agent could achieve a benchmark reward given enough training resources. By formulating this problem as a two-player zero-sum game, optimizing the FARR objective…
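The feasibility idea in the abstract can be illustrated with a small tabular sketch. This is not the authors' implementation: it assumes a finite set of candidate policies and parameter values, a made-up return table `RETURNS`, and a hypothetical benchmark value `BENCHMARK`. It also approximates "what an agent could achieve given enough training resources" by the best return among the listed policies, and it considers only pure strategies rather than the full zero-sum game.

```python
from typing import List, Sequence, Tuple


def feasible_params(returns: Sequence[Sequence[float]], benchmark: float) -> List[int]:
    """Indices of parameter values on which some policy reaches the benchmark.

    Assumption: the best return over the listed policies stands in for what
    an agent could achieve with enough training resources.
    """
    n_params = len(returns[0])
    return [
        j for j in range(n_params)
        if max(row[j] for row in returns) >= benchmark
    ]


def robust_policy(
    returns: Sequence[Sequence[float]], params: Sequence[int]
) -> Tuple[int, float]:
    """Pure-strategy maximin: the policy with the best worst-case return over `params`."""
    if not params:
        raise ValueError("no parameter values to be robust over")
    worst_case = [min(row[j] for j in params) for row in returns]
    best = max(range(len(returns)), key=lambda i: worst_case[i])
    return best, worst_case[best]


# Illustrative numbers only (rows: policies, columns: environment parameter values).
RETURNS = [
    [10.0, 6.0, 1.0],  # policy 0: strong on parameter 0, weak on parameter 2
    [7.0, 7.0, 2.0],   # policy 1: balanced
    [3.0, 3.0, 3.0],   # policy 2: very cautious
]
BENCHMARK = 5.0  # hypothetical benchmark reward

feasible = feasible_params(RETURNS, BENCHMARK)
farr_choice = robust_policy(RETURNS, feasible)
broad_choice = robust_policy(RETURNS, range(len(RETURNS[0])))

print("feasible parameter indices:", feasible)
print("robust over feasible set (policy, worst case):", farr_choice)
print("robust over all parameters (policy, worst case):", broad_choice)
```

In this toy table, no policy reaches the benchmark on the third parameter value, so it is excluded from the feasible set. Being robust over all three values selects the very cautious policy, while restricting to the feasible set selects the balanced one, which mirrors the "too broad means too cautious" concern described in the abstract.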
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Advanced Bandit Algorithms Research
