Solving Robust Markov Decision Processes: Generic, Reliable, Efficient

Tobias Meggendorfer; Maximilian Weininger; Patrick Wienhöft

arXiv:2412.10185·cs.AI·December 16, 2024

TL;DR

This paper introduces a generic, reliable, and efficient framework for solving robust Markov decision processes (RMDPs). It handles a wide range of uncertainty sets and objectives, provides precision guarantees at any time during the computation, and outperforms existing methods.

Contribution

The authors develop a unified framework for robust MDPs that is broadly applicable, guarantees convergence and precision, and significantly improves computational efficiency.

Findings

01 Framework handles diverse uncertainty sets, including intervals and polytopes.

02 Provides convergence guarantees and precision guarantees at any stage of the computation.

03 Solves large-scale RMDPs with over a million states in under a minute.

Abstract

Markov decision processes (MDPs) are a well-established model for sequential decision-making in the presence of probabilities. In robust MDPs (RMDPs), every action is associated with an uncertainty set of probability distributions, modelling that transition probabilities are not known precisely. Based on the known theoretical connection to stochastic games, we provide a framework for solving RMDPs that is generic, reliable, and efficient. It is *generic* both with respect to the model, allowing for a wide range of uncertainty sets, including but not limited to intervals, $L^{1}$- or $L^{2}$-balls, and polytopes; and with respect to the objective, including long-run average reward, undiscounted total reward, and stochastic shortest path. It is *reliable*, as our approach not only converges in the limit, but provides precision guarantees at any time during the computation. It is *efficient*…
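To make the notion of an uncertainty set concrete, the following minimal sketch computes the worst-case expected value of one action when the uncertainty set is given by intervals. It is an illustration only and is not taken from the paper: the function name `worst_case_expectation` and its parameters are hypothetical, and the greedy ordering argument used here is the standard textbook approach for interval sets, which may differ from the authors' implementation. The adversary is assumed to pick the distribution that minimises the expected successor value.

```python
from typing import List, Sequence


def worst_case_expectation(
    lower: Sequence[float],
    upper: Sequence[float],
    values: Sequence[float],
) -> float:
    """Minimise sum_i p_i * values[i] over all distributions p with
    lower[i] <= p_i <= upper[i] and sum_i p_i = 1 (interval uncertainty set).

    Hypothetical helper for illustration; not the paper's algorithm.
    """
    eps = 1e-12
    if sum(lower) > 1.0 + eps or sum(upper) < 1.0 - eps:
        raise ValueError("interval set contains no probability distribution")

    # Start from the lower bounds, then distribute the remaining mass.
    probs: List[float] = list(lower)
    budget = 1.0 - sum(lower)

    # The adversary pushes the remaining mass to the successors with the
    # smallest values first, as far as the upper bounds allow.
    for i in sorted(range(len(values)), key=lambda j: values[j]):
        shift = min(upper[i] - lower[i], budget)
        probs[i] += shift
        budget -= shift

    return sum(p * v for p, v in zip(probs, values))


if __name__ == "__main__":
    # Three successors with current value estimates 10, 0 and 5.
    print(worst_case_expectation([0.1, 0.2, 0.3], [0.5, 0.6, 0.7], [10.0, 0.0, 5.0]))
```

In the stochastic-game reading mentioned in the abstract, this inner minimisation plays the role of the opponent's move: the agent chooses an action, and the environment then chooses a distribution from that action's uncertainty set. Other uncertainty sets such as $L^{1}$-balls or polytopes would need a different inner solver.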
