Offline Bayesian Aleatoric and Epistemic Uncertainty Quantification and Posterior Value Optimisation in Finite-State MDPs

Filippo Valdettaro; A. Aldo Faisal

arXiv:2406.02456·cs.LG·June 5, 2024

TL;DR

This paper introduces a Bayesian approach to quantify and disentangle aleatoric and epistemic uncertainties in finite-state MDPs, and proposes a method to optimize policies based on posterior expected value, demonstrated on gridworlds and the AI Clinician problem.

Contribution

The paper presents a technique for uncertainty quantification and policy optimisation in offline finite-state MDPs that does not rely on strong assumptions about the MDP's posterior distribution.

Findings

01. Effective uncertainty disentanglement in simple MDPs

02. Successful policy optimization using closed-form value expressions

03. Scalable approach demonstrated on real-world ICU treatment data

Abstract

We address the challenge of quantifying Bayesian uncertainty and incorporating it in offline use cases of finite-state Markov Decision Processes (MDPs) with unknown dynamics. Our approach provides a principled method to disentangle epistemic and aleatoric uncertainty, and a novel technique to find policies that optimise Bayesian posterior expected value without relying on strong assumptions about the MDP's posterior distribution. First, we utilise standard Bayesian reinforcement learning methods to capture the posterior uncertainty in MDP parameters based on available data. We then analytically compute the first two moments of the return distribution across posterior samples and apply the law of total variance to disentangle aleatoric and epistemic uncertainties. To find policies that maximise posterior expected value, we leverage the closed-form expression for value as a function of…
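
The law-of-total-variance step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a fixed policy already folded into the transition counts, deterministic per-state rewards, an independent Dirichlet posterior per state with a symmetric prior, and fixed-point iteration in place of a linear solve. The names `sample_dirichlet`, `return_moments`, and `decompose` are hypothetical. For each posterior sample it computes the mean and second moment of the discounted return, then reports the average within-model variance (aleatoric) and the across-model variance of the value (epistemic).

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one probability vector from Dirichlet(alpha) via normalised gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def return_moments(P, r, gamma, iters=300):
    """First and second moments of the discounted return for one sampled model.

    Assumes a fixed policy (P is state-to-state) and deterministic rewards r[s].
    V[s] = r[s] + gamma * sum_t P[s][t] * V[t]
    M[s] = r[s]^2 + 2*gamma*r[s]*sum_t P[s][t]*V[t] + gamma^2 * sum_t P[s][t]*M[t]
    Both are found by fixed-point iteration, which converges for gamma < 1.
    """
    n = len(r)
    V = [0.0] * n
    M = [0.0] * n
    for _ in range(iters):
        EV = [sum(P[s][t] * V[t] for t in range(n)) for s in range(n)]
        EM = [sum(P[s][t] * M[t] for t in range(n)) for s in range(n)]
        V = [r[s] + gamma * EV[s] for s in range(n)]
        M = [r[s] ** 2 + 2 * gamma * r[s] * EV[s] + gamma ** 2 * EM[s]
             for s in range(n)]
    return V, M


def decompose(counts, r, gamma, start, n_samples=200, prior=1.0, seed=0):
    """Split return variance at `start` into aleatoric and epistemic parts."""
    rng = random.Random(seed)
    means, seconds = [], []
    for _ in range(n_samples):
        # One posterior sample: each row is Dirichlet(counts + prior) (assumed prior).
        P = [sample_dirichlet([c + prior for c in row], rng) for row in counts]
        V, M = return_moments(P, r, gamma)
        means.append(V[start])
        seconds.append(M[start])
    k = float(n_samples)
    mean_value = sum(means) / k
    # Aleatoric: expected within-model variance of the return.
    aleatoric = sum(m2 - m * m for m, m2 in zip(means, seconds)) / k
    # Epistemic: variance of the model-conditional expected return.
    epistemic = sum((m - mean_value) ** 2 for m in means) / k
    # Total variance computed directly from the pooled second moment.
    total = sum(seconds) / k - mean_value ** 2
    return {"mean": mean_value, "aleatoric": aleatoric,
            "epistemic": epistemic, "total": total}


if __name__ == "__main__":
    # Hypothetical transition counts for a 3-state chain under a fixed policy.
    counts = [[5, 3, 0], [1, 4, 2], [0, 0, 6]]
    print(decompose(counts, r=[0.0, 1.0, 0.0], gamma=0.9, start=0))
```

In this sketch the two components sum to the total variance up to floating-point error, which is the identity the law of total variance guarantees. Collecting more transitions shrinks the Dirichlet posterior and hence the epistemic term, while the aleatoric term reflects randomness that remains even when the dynamics are known.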

Code & Models

Repositories

filippovaldettaro/finite-state-mdps
PyTorch · Official

Taxonomy

Topics: Numerical Methods and Algorithms · Formal Methods in Verification · Low-power high-performance VLSI design