Rigorous Agent Evaluation: An Adversarial Approach to Uncover Catastrophic Failures

Jonathan Uesato; Ananya Kumar; Csaba Szepesvari; Tom Erez; Avraham Ruderman; Keith Anderson; Krishnamurthy (Dj) Dvijotham; Nicolas Heess; Pushmeet Kohli

arXiv:1812.01647 · cs.LG · December 6, 2018 · 46 cites

Open Access

TL;DR

This paper introduces an adversarial evaluation method for learning agents in safety-critical domains, effectively identifying catastrophic failures and accurately estimating failure probabilities much faster than traditional methods.

Contribution

It proposes an adversarial evaluation approach that uncovers rare failure scenarios and provides unbiased failure probability estimates, improving safety assessments of learned agents.

Findings

01. Successfully finds catastrophic failures in humanoid control and driving domains.
02. Estimates failure rates orders of magnitude faster than standard methods.
03. Reuses existing training data for efficient evaluation.

Abstract

This paper addresses the problem of evaluating learning systems in safety critical domains such as autonomous driving, where failures can have catastrophic consequences. We focus on two problems: searching for scenarios when learned agents fail and assessing their probability of failure. The standard method for agent evaluation in reinforcement learning, Vanilla Monte Carlo, can miss failures entirely, leading to the deployment of unsafe agents. We demonstrate this is an issue for current agents, where even matching the compute used for training is sometimes insufficient for evaluation. To address this shortcoming, we draw upon the rare event probability estimation literature and propose an adversarial evaluation approach. Our approach focuses evaluation on adversarially chosen situations, while still providing unbiased estimates of failure probabilities. The key difficulty is in…
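
The contrast the abstract draws between Vanilla Monte Carlo and evaluation focused on adversarially chosen situations can be illustrated with a generic importance-sampling sketch. This is not the paper's method: the toy failure rule `agent_fails`, the hand-picked proposal distribution, and all function names below are assumptions made for illustration. The sketch only shows why sampling more often from suspected failure regions, then reweighting each sample by the ratio of the true density to the proposal density, keeps the failure-probability estimate unbiased.

```python
import random

# Toy setup (hypothetical): the initial condition x is drawn uniformly from
# [0, 1), and the "agent" fails only when x > 0.999, so the true failure
# rate is 0.001. It is known here only because the problem is synthetic.
TRUE_FAILURE_RATE = 0.001


def agent_fails(x: float) -> bool:
    """Stand-in for running one episode from initial condition x."""
    return x > 0.999


def vanilla_monte_carlo(n: int, rng: random.Random) -> float:
    """Sample initial conditions from the true distribution and count failures."""
    failures = sum(agent_fails(rng.random()) for _ in range(n))
    return failures / n


def proposal_sample(rng: random.Random) -> float:
    """Half the time sample from the suspected failure region [0.99, 1.0],
    otherwise fall back to the true distribution."""
    if rng.random() < 0.5:
        return rng.uniform(0.99, 1.0)
    return rng.random()


def proposal_density(x: float) -> float:
    """Density of the mixture proposal: 0.5 * Uniform(0, 1) + 0.5 * Uniform(0.99, 1)."""
    return 0.5 * 1.0 + 0.5 * (100.0 if x >= 0.99 else 0.0)


def importance_sampling(n: int, rng: random.Random) -> float:
    """Average of failure indicators weighted by p(x) / q(x), with p(x) = 1 on [0, 1)."""
    total = 0.0
    for _ in range(n):
        x = proposal_sample(rng)
        if agent_fails(x):
            total += 1.0 / proposal_density(x)
    return total / n


if __name__ == "__main__":
    print("vanilla MC, 100 episodes:", vanilla_monte_carlo(100, random.Random(1)))
    print("importance sampling, 10000 episodes:", importance_sampling(10_000, random.Random(0)))
```

With only 100 episodes, the vanilla estimator in this toy problem sees no failure at all roughly 90% of the time (0.999^100 ≈ 0.905), which is the failure mode the abstract warns about. Because the proposal keeps a positive density everywhere the true distribution does, the reweighted estimate stays unbiased even though the proposal was chosen by hand.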

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Reinforcement Learning in Robotics