An Offline Risk-aware Policy Selection Method for Bayesian Markov Decision Processes

Giorgio Angelotti; Nicolas Drougard; Caroline Ponzoni Carvalho Chanel

arXiv:2105.13431·cs.LG·April 12, 2023

TL;DR

This paper introduces Exploitation vs Caution (EvC), a Bayesian offline policy selection method that accounts for model uncertainty when data are limited, with the aim of choosing policies that remain reliable once deployed.

Contribution

The paper proposes a novel risk-aware policy selection framework, EvC, that incorporates Bayesian uncertainty to improve robustness in offline MDP planning and reinforcement learning.
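
To make the idea of risk-aware selection under Bayesian model uncertainty concrete, here is a minimal sketch. It is not the authors' implementation, and every modelling choice in it is an assumption that this page does not confirm: a tabular MDP, a Dirichlet posterior over transitions built from offline visit counts, known rewards, and lower-tail CVaR as the risk measure. All function names (`sample_transition_models`, `policy_value`, `cvar_lower`, `select_policy`) are hypothetical.

```python
import numpy as np


def sample_transition_models(counts, prior=1.0, n_models=200, rng=None):
    """Draw transition models from a Dirichlet posterior (assumed model class).

    counts[s, a, s2] is the number of observed transitions s -a-> s2
    in the offline data set.
    """
    rng = np.random.default_rng(rng)
    n_states, n_actions, _ = counts.shape
    models = np.empty((n_models, n_states, n_actions, n_states))
    for s in range(n_states):
        for a in range(n_actions):
            models[:, s, a, :] = rng.dirichlet(counts[s, a] + prior, size=n_models)
    return models


def policy_value(P, R, policy, gamma, mu0):
    """Expected discounted return of a deterministic policy in one model."""
    n_states = P.shape[0]
    idx = np.arange(n_states)
    P_pi = P[idx, policy]  # (S, S) transition matrix under the policy
    r_pi = R[idx, policy]  # (S,) reward vector under the policy
    v = np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)
    return float(mu0 @ v)


def cvar_lower(values, alpha):
    """Mean of the worst alpha-fraction of values (assumed risk measure)."""
    ordered = np.sort(np.asarray(values, dtype=float))
    k = max(1, int(np.ceil(alpha * len(ordered))))
    return float(ordered[:k].mean())


def select_policy(counts, R, policies, gamma=0.9, alpha=0.1,
                  mu0=None, n_models=200, seed=0):
    """Score each candidate policy by CVaR over sampled models; return the best."""
    n_states = counts.shape[0]
    if mu0 is None:
        mu0 = np.full(n_states, 1.0 / n_states)
    models = sample_transition_models(counts, n_models=n_models, rng=seed)
    scores = []
    for pi in policies:
        vals = [policy_value(P, R, np.asarray(pi), gamma, mu0) for P in models]
        scores.append(cvar_lower(vals, alpha))
    return int(np.argmax(scores)), scores
```

In this sketch, a small `alpha` favours policies whose worst-case value across plausible models is high (caution), while `alpha=1.0` reduces the score to the posterior mean value (exploitation). How EvC itself trades these off is described in the paper, not here.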

Findings

01. EvC effectively selects robust policies in simple discrete environments.

02. EvC outperforms state-of-the-art approaches in terms of robustness.

03. The method is suitable for offline applications prioritizing safety and reliability.

Abstract

In Offline Model Learning for Planning and in Offline Reinforcement Learning, the limited data set hinders the estimation of the value function of the corresponding Markov Decision Process (MDP). Consequently, the performance of the obtained policy in the real world is bounded and possibly risky, especially when deploying a wrong policy can lead to catastrophic consequences. For this reason, several approaches are being pursued with the aim of reducing the model error (or the distributional shift between the learned model and the true one) and, more broadly, of obtaining solutions that are risk-aware with respect to model uncertainty. But when it comes to the final application, which baseline should a practitioner choose? In an offline context where computational time is not an issue and robustness is the priority, we propose Exploitation vs Caution (EvC), a paradigm that (1) elegantly incorporates…

Code & Models

Repositories

giorgioangel/evc (Official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI) · Bayesian Modeling and Causal Inference