Feasible Adversarial Robust Reinforcement Learning for Underspecified Environments

JB Lanier; Stephen McAleer; Pierre Baldi; Roy Fox

arXiv:2207.09597 · cs.LG · October 5, 2022

Open Access

TL;DR

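This paper introduces Feasible Adversarial Robust RL (FARR), a problem formulation and objective for robust reinforcement learning that automatically determines the set of environment parameter values over which a policy should be robust. It aims to avoid both the vulnerability that comes from specifying that set too narrowly and the excessive caution that comes from specifying it too broadly.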

Contribution

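FARR formulates the choice of environment parameters as a two-player zero-sum game. The set of feasible parameter values is defined implicitly as those on which an agent could achieve a benchmark reward given enough training resources, so it does not have to be specified by hand.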

Findings

01. FARR outperforms existing methods in robustness across tested environments.

02. Approximate Nash equilibria can be efficiently computed with a variation of PSRO.

03. Optimal FARR-trained agents show increased resilience to adversarial environment parameters.
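
To make the phrase "approximate Nash equilibrium" concrete, here is a minimal sketch that solves a small two-player zero-sum matrix game with fictitious play. This is not PSRO and not the variation used in the paper. It is a simpler classical method swapped in for illustration, and the function names `fictitious_play` and `exploitability` are hypothetical.

```python
import numpy as np

def fictitious_play(payoff, iterations=5000):
    """Approximate a Nash equilibrium of a zero-sum matrix game.

    payoff[i, j] is the row player's payoff; the column player gets -payoff[i, j].
    Each player repeatedly best-responds to the opponent's empirical mixture.
    """
    n_rows, n_cols = payoff.shape
    row_counts = np.zeros(n_rows)
    col_counts = np.zeros(n_cols)
    row_counts[0] = 1.0  # arbitrary initial pure strategies
    col_counts[0] = 1.0
    for _ in range(iterations):
        row_mix = row_counts / row_counts.sum()
        col_mix = col_counts / col_counts.sum()
        best_row = int(np.argmax(payoff @ col_mix))  # row player maximizes
        best_col = int(np.argmin(row_mix @ payoff))  # column player minimizes
        row_counts[best_row] += 1.0
        col_counts[best_col] += 1.0
    return row_counts / row_counts.sum(), col_counts / col_counts.sum()

def exploitability(payoff, row_mix, col_mix):
    """Sum of both players' best-response gains; zero at an exact equilibrium."""
    return float((payoff @ col_mix).max() - (row_mix @ payoff).min())

# Matching pennies: the unique equilibrium mixes both actions equally.
matching_pennies = np.array([[1.0, -1.0], [-1.0, 1.0]])
row_mix, col_mix = fictitious_play(matching_pennies)
gap = exploitability(matching_pennies, row_mix, col_mix)
print(row_mix, col_mix, gap)
```

The exploitability gap measures how far the returned mixtures are from an exact equilibrium. Population-based methods such as PSRO apply the same idea to a growing set of learned policies rather than to a fixed matrix.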

Abstract

Robust reinforcement learning (RL) considers the problem of learning policies that perform well in the worst case among a set of possible environment parameter values. In real-world environments, choosing the set of possible values for robust RL can be a difficult task. When that set is specified too narrowly, the agent will be left vulnerable to reasonable parameter values unaccounted for. When specified too broadly, the agent will be too cautious. In this paper, we propose Feasible Adversarial Robust RL (FARR), a novel problem formulation and objective for automatically determining the set of environment parameter values over which to be robust. FARR implicitly defines the set of feasible parameter values as those on which an agent could achieve a benchmark reward given enough training resources. By formulating this problem as a two-player zero-sum game, optimizing the FARR objective…
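
The feasibility idea in the abstract can be illustrated with a toy tabular sketch, assuming a finite set of candidate policies and parameter values with known expected returns. The `returns` table, the benchmark value, and the function names are all hypothetical, and the sketch is restricted to pure strategies. It is not the paper's training procedure, which works with learned policies in a two-player game.

```python
import numpy as np

def feasible_mask(returns, benchmark):
    """Mark parameter values on which some policy reaches the benchmark.

    returns[i, j] is the expected return of policy i under parameter value j.
    """
    best_per_param = returns.max(axis=0)
    return best_per_param >= benchmark

def robust_choice_over_feasible(returns, benchmark):
    """Pick the policy with the best worst-case return over feasible parameters."""
    mask = feasible_mask(returns, benchmark)
    if not mask.any():
        raise ValueError("no parameter value meets the benchmark")
    worst_case = returns[:, mask].min(axis=1)
    best_policy = int(worst_case.argmax())
    return best_policy, float(worst_case[best_policy]), mask

# Hypothetical returns: 3 policies (rows) x 3 parameter values (columns).
# No policy reaches the benchmark of 5.0 on the last parameter value.
returns = np.array([
    [10.0, 6.0, -5.0],
    [ 8.0, 7.0, -4.0],
    [ 3.0, 3.0, -1.0],
])
policy, value, mask = robust_choice_over_feasible(returns, benchmark=5.0)
minimax_policy = int(returns.min(axis=1).argmax())  # robust over ALL parameters
print(policy, value, mask, minimax_policy)
```

In this toy table, plain minimax over every parameter value selects the overly cautious third policy, because the infeasible last column dominates the worst case. Restricting the adversary to feasible values selects the second policy instead.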

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Advanced Bandit Algorithms Research