Model-Based Uncertainty in Value Functions
Carlos E. Luis, Alessandro G. Bottero, Julia Vinogradska, Felix Berkenkamp, Jan Peters

TL;DR
This paper introduces a new uncertainty Bellman equation for better quantification of value function uncertainty in model-based reinforcement learning, leading to improved exploration and sample efficiency.
Contribution
The paper presents a novel uncertainty Bellman equation whose solution converges to the true posterior variance over values, rather than an upper bound on it, and that plugs into common exploration strategies in RL.
Findings
Sharper uncertainty estimates improve sample efficiency.
Method scales to deep RL architectures.
Enhanced exploration in complex tasks.
Abstract
We consider the problem of quantifying uncertainty over expected cumulative rewards in model-based reinforcement learning. In particular, we focus on characterizing the variance over values induced by a distribution over MDPs. Previous work upper bounds the posterior variance over values by solving a so-called uncertainty Bellman equation, but the over-approximation may result in inefficient exploration. We propose a new uncertainty Bellman equation whose solution converges to the true posterior variance over values and explicitly characterizes the gap in previous work. Moreover, our uncertainty quantification technique is easily integrated into common exploration strategies and scales naturally beyond the tabular setting by using standard deep reinforcement learning architectures. Experiments in difficult exploration tasks, both in tabular and continuous control settings, show that our…
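To make the quantity in the abstract concrete, the following is a minimal sketch of "the variance over values induced by a distribution over MDPs" in a small tabular setting. It is not the paper's uncertainty Bellman equation: it is a brute-force Monte Carlo estimate that samples MDPs and evaluates each one, which is the kind of reference quantity a UBE solution would be compared against. The Dirichlet posterior over transitions, the known reward vector, the fixed policy folded into the transition matrix, and the names `policy_evaluation`, `value_posterior_variance`, and `counts` are all assumptions made for illustration.

```python
import numpy as np


def policy_evaluation(P, r, gamma):
    """Solve V = r + gamma * P @ V exactly for a fixed policy.

    P is the state-to-state transition matrix under that policy.
    """
    n = P.shape[0]
    return np.linalg.solve(np.eye(n) - gamma * P, r)


def value_posterior_variance(counts, r, gamma=0.9, n_samples=2000, seed=0):
    """Monte Carlo mean and variance of V under a Dirichlet posterior.

    counts[s, s'] are Dirichlet concentration parameters (assumed > 0),
    e.g. observed transition counts plus a prior pseudo-count.
    """
    rng = np.random.default_rng(seed)
    n = counts.shape[0]
    values = np.empty((n_samples, n))
    for i in range(n_samples):
        # One Dirichlet draw per state row gives one plausible MDP.
        P = np.vstack([rng.dirichlet(alpha) for alpha in counts])
        values[i] = policy_evaluation(P, r, gamma)
    return values.mean(axis=0), values.var(axis=0)


# Hypothetical 3-state chain: only state 2 yields reward.
counts = np.array([[5.0, 1.0, 1.0],
                   [1.0, 5.0, 1.0],
                   [1.0, 1.0, 20.0]])
r = np.array([0.0, 0.0, 1.0])

mean_v, var_v = value_posterior_variance(counts, r)
# With 100x more data the posterior concentrates, so the variance shrinks.
_, var_v_more_data = value_posterior_variance(100.0 * counts, r)
```

Sampling and solving many MDPs is affordable here only because the state space is tiny. A Bellman-style recursion for the variance, as the paper proposes, avoids this enumeration, which is what allows the approach to move beyond the tabular setting.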
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)
