Model-Value Inconsistency as a Signal for Epistemic Uncertainty

Angelos Filos, Eszter Vértes, Zita Marinho, Gregory Farquhar, Diana Borsa, Abram Friesen, Feryal Behbahani, Tom Schaul, André Barreto, Simon Osindero

arXiv:2112.04153 · cs.LG · July 1, 2022

TL;DR

This paper introduces model-value inconsistency as a simple, effective proxy for epistemic uncertainty in model-based reinforcement learning, leveraging existing models and value functions to improve exploration, safety, and robustness.

Contribution

The paper proposes the implicit value ensemble (IVE) and shows that the discrepancy among its value estimates serves as a practical uncertainty signal without training any additional models.

Findings

01 Self-inconsistency improves exploration efficiency.

02 It enhances safety under distribution shifts.

03 It makes value-based planning with a learned model more robust.

Abstract

Using a model of the environment and a value function, an agent can construct many estimates of a state's value, by unrolling the model for different lengths and bootstrapping with its value function. Our key insight is that one can treat this set of value estimates as a type of ensemble, which we call an implicit value ensemble (IVE). Consequently, the discrepancy between these estimates can be used as a proxy for the agent's epistemic uncertainty; we term this signal model-value inconsistency or self-inconsistency for short. Unlike prior work which estimates uncertainty by training an ensemble of many models and/or value functions, this approach requires only the single model and value function which are already being learned in most model-based reinforcement learning algorithms. We provide empirical evidence in both tabular and function approximation settings…
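The construction described in the abstract can be illustrated with a minimal tabular sketch. This is not the authors' implementation. It assumes a fixed policy, so the learned model reduces to a transition matrix `P` and an expected-reward vector `r`, and it uses the standard deviation across unroll lengths as the discrepancy measure, which is one possible choice. The function names, the toy two-state model, and all numbers are hypothetical.

```python
import numpy as np


def k_step_value_estimates(P, r, v, gamma, n):
    """Return an array of shape (n + 1, num_states).

    Row k holds the value estimate obtained by unrolling the model for
    k steps and then bootstrapping with the value function v.
    Row 0 is v itself (no unrolling).
    Assumption: P and r describe the learned model under a fixed policy.
    """
    estimates = [np.asarray(v, dtype=float)]
    for _ in range(n):
        # One more step of model unrolling: apply the model's Bellman operator.
        estimates.append(r + gamma * P @ estimates[-1])
    return np.stack(estimates)


def self_inconsistency(P, r, v, gamma, n):
    """Per-state spread of the implicit value ensemble.

    Assumption: standard deviation is used as the discrepancy measure.
    """
    return k_step_value_estimates(P, r, v, gamma, n).std(axis=0)


# Hypothetical two-state model, for illustration only.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])
r = np.array([1.0, 0.0])
gamma = 0.9

# A value function that exactly satisfies the model's Bellman equation.
v_consistent = np.linalg.solve(np.eye(2) - gamma * P, r)
# The same value function with an error injected in the second state.
v_perturbed = v_consistent + np.array([0.0, 2.0])

sigma_consistent = self_inconsistency(P, r, v_consistent, gamma, n=5)
sigma_perturbed = self_inconsistency(P, r, v_perturbed, gamma, n=5)
```

When the value function agrees with the model, every unroll length yields the same estimate and the spread is zero (up to floating-point error). Once the value function is perturbed, the estimates disagree and the spread becomes positive, which is the signal the abstract proposes as a proxy for epistemic uncertainty.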

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Reinforcement Learning in Robotics · Machine Learning and Algorithms