Robust $Q$-learning Algorithm for Markov Decision Processes under Wasserstein Uncertainty

Ariel Neufeld; Julian Sester

arXiv:2210.00898·cs.LG·June 21, 2024

TL;DR

This paper introduces a $Q$-learning algorithm for distributionally robust Markov decision processes with Wasserstein ambiguity sets, proves its convergence, and illustrates its practical benefits with examples that include real data.

Contribution

The paper develops a novel $Q$-learning method for Wasserstein distributional robustness in MDPs, with proven convergence and demonstrated practical advantages.

Findings

1. The algorithm is proven to converge.
2. Distributional robustness improves decision quality when the estimated transition distributions are misspecified.
3. Examples, including ones using real data, illustrate the algorithm's tractability and the benefits of robustness.

Abstract

We present a novel $Q$-learning algorithm tailored to solve distributionally robust Markov decision problems where the corresponding ambiguity set of transition probabilities for the underlying Markov decision process is a Wasserstein ball around a (possibly estimated) reference measure. We prove convergence of the presented algorithm and provide several examples also using real data to illustrate both the tractability of our algorithm as well as the benefits of considering distributional robustness when solving stochastic optimal control problems, in particular when the estimated distributions turn out to be misspecified in practice.
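To make the idea of a worst case over a Wasserstein ball concrete, here is a minimal sketch. It is not the paper's algorithm. It assumes a finite, one-dimensional state space, the Wasserstein-1 distance with cost |x - y|, and competing measures supported on the same finite grid. The worst-case expectation is approximated through the standard dual formulation, with a plain grid search over the dual variable. All names (`worst_case_expectation`, `robust_backup`, `lam_max`, `n_grid`) are hypothetical and chosen for illustration only.

```python
def worst_case_expectation(values, ref_probs, states, eps, lam_max=2.0, n_grid=200):
    """Approximate inf E_P[values] over a Wasserstein-1 ball of radius eps
    around ref_probs, using the dual
        sup_{lam >= 0} E_ref[min_y (v(y) + lam * |x - y|)] - lam * eps.
    Assumption: lam_max is at least the Lipschitz constant of `values`;
    the grid search returns a lower bound if the optimum lies off the grid."""
    best = float("-inf")
    for i in range(n_grid + 1):
        lam = lam_max * i / n_grid
        # For each reference state x, take the cheapest penalised alternative y.
        transformed = [
            min(v_y + lam * abs(x - y) for y, v_y in zip(states, values))
            for x in states
        ]
        dual = sum(p * t for p, t in zip(ref_probs, transformed)) - lam * eps
        best = max(best, dual)
    return best


def robust_backup(reward, next_values, ref_probs, states, eps, gamma=0.9):
    """One robust Bellman backup: reward plus discounted worst-case next value."""
    return reward + gamma * worst_case_expectation(next_values, ref_probs, states, eps)


# Toy example (made-up numbers): the reference measure puts all mass on state 2.
states = [0.0, 1.0, 2.0]
values = [0.0, 1.0, 2.0]
ref_probs = [0.0, 0.0, 1.0]

nominal = worst_case_expectation(values, ref_probs, states, eps=0.0)
robust = worst_case_expectation(values, ref_probs, states, eps=0.5)
print(nominal, robust)
```

With radius 0 the function returns the ordinary expectation under the reference measure; with a positive radius the adversary may shift probability mass toward lower-valued states, so the result can only decrease. Setting the radius therefore trades off trust in the estimated model against protection from misspecification.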

Code & Models

Repositories

juliansester/wasserstein-q-learning (official)

Taxonomy

Topics: Risk and Portfolio Optimization