Uncertainty-Based Out-of-Distribution Classification in Deep Reinforcement Learning

Andreas Sedlmeier; Thomas Gabor; Thomy Phan; Lenz Belzner; Claudia Linnhoff-Popien

arXiv:2001.00496 · cs.LG · April 17, 2020

TL;DR

This paper introduces UBOOD, a framework for detecting out-of-distribution states in deep reinforcement learning by leveraging epistemic uncertainty, with a focus on ensemble methods for reliable OOD classification.

Contribution

The paper proposes a novel uncertainty-based OOD detection framework for deep RL, utilizing epistemic uncertainty and dynamic thresholds, compatible with various uncertainty estimation techniques.

Findings

1. Ensemble-based estimators reliably detect OOD states.

2. Dropout-based estimators struggle with OOD detection.

3. Dynamic thresholding improves OOD classification accuracy.
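
To make the idea of a dynamic threshold concrete, here is a minimal sketch of a threshold that follows the uncertainties observed on in-distribution states instead of being fixed once. This summary does not state the paper's exact thresholding rule, so the rule used here (mean plus `k` standard deviations over a sliding window) is an assumption, and the class name `DynamicThreshold` and its parameters `window` and `k` are hypothetical.

```python
import statistics
from collections import deque


class DynamicThreshold:
    """Hypothetical helper: a threshold that tracks recent in-distribution uncertainties.

    Assumed rule (not taken from the paper): mean + k * standard deviation
    over a sliding window of the most recent uncertainty values.
    """

    def __init__(self, window: int = 100, k: float = 2.0) -> None:
        self.values = deque(maxlen=window)  # oldest values drop out automatically
        self.k = k

    def update(self, uncertainty: float) -> None:
        """Record the uncertainty of a state known to be in-distribution."""
        self.values.append(uncertainty)

    @property
    def value(self) -> float:
        """Current threshold; infinite until at least two values are recorded."""
        if len(self.values) < 2:
            return float("inf")
        return statistics.fmean(self.values) + self.k * statistics.pstdev(self.values)

    def classify(self, uncertainty: float) -> bool:
        """Return True if the state is classified as out-of-distribution."""
        return uncertainty > self.value


# Uncertainty on in-distribution states typically shrinks as training progresses,
# so the threshold tightens as older, larger values leave the window.
threshold = DynamicThreshold(window=4, k=2.0)
for u in [0.9, 0.5, 0.3, 0.2, 0.2, 0.2]:
    threshold.update(u)

low_is_ood = threshold.classify(0.25)   # close to recent in-distribution values
high_is_ood = threshold.classify(1.0)   # far above recent in-distribution values
```

With a threshold fixed early in training (for example at the first recorded value, 0.9), only uncertainties above 0.9 would be flagged once the agent has become more certain; the windowed version adapts to the current uncertainty level.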

Abstract

Robustness to out-of-distribution (OOD) data is an important goal in building reliable machine learning systems. Especially in autonomous systems, wrong predictions for OOD inputs can cause safety critical situations. As a first step towards a solution, we consider the problem of detecting such data in a value-based deep reinforcement learning (RL) setting. Modelling this problem as a one-class classification problem, we propose a framework for uncertainty-based OOD classification: UBOOD. It is based on the effect that an agent's epistemic uncertainty is reduced for situations encountered during training (in-distribution), and thus lower than for unencountered (OOD) situations. Being agnostic towards the approach used for estimating epistemic uncertainty, combinations with different uncertainty estimation methods, e.g. approximate Bayesian inference methods or ensembling techniques are…
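
The core effect described in the abstract can be illustrated with a small sketch: epistemic uncertainty is estimated as the disagreement between ensemble members, and a state is classified as OOD when that uncertainty exceeds a threshold. Everything below is an illustrative assumption rather than the authors' implementation: the toy ensemble from `make_member` stands in for trained Q-networks, the variance-based measure in `epistemic_uncertainty` is one common choice, and the fixed threshold of 0.1 is arbitrary.

```python
import statistics
from typing import Callable, List, Sequence

# Hypothetical type: a Q-function maps a state to one value per action.
QFunction = Callable[[Sequence[float]], List[float]]


def epistemic_uncertainty(ensemble: Sequence[QFunction], state: Sequence[float]) -> float:
    """Variance of Q-values across ensemble members, averaged over actions."""
    per_member = [q(state) for q in ensemble]   # members x actions
    per_action = list(zip(*per_member))         # actions x members
    return statistics.fmean([statistics.pvariance(vals) for vals in per_action])


def is_ood(uncertainty: float, threshold: float) -> bool:
    """One-class decision: uncertainty above the threshold means OOD."""
    return uncertainty > threshold


def make_member(slope: float) -> QFunction:
    """Toy stand-in for a trained network (assumption for illustration).

    All members agree on |x| <= 1, the pretend training range, and
    extrapolate with different slopes outside it.
    """
    def q(state: Sequence[float]) -> List[float]:
        x = state[0]
        excess = max(0.0, abs(x) - 1.0)
        return [x + slope * excess, -x - slope * excess]
    return q


ensemble = [make_member(s) for s in (-1.0, 0.0, 1.0)]

u_in = epistemic_uncertainty(ensemble, [0.5])   # inside the training range
u_out = epistemic_uncertainty(ensemble, [3.0])  # outside the training range

print(is_ood(u_in, threshold=0.1), is_ood(u_out, threshold=0.1))
```

Because the members agree exactly inside the pretend training range, the in-distribution state has zero disagreement, while the members' differing extrapolations produce a clearly larger uncertainty for the unseen state.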
