TL;DR
This paper introduces a generic, reliable, and efficient framework for solving robust Markov decision processes (RMDPs). It supports a wide range of uncertainty sets and objectives, provides precision guarantees at any time during the computation, and outperforms existing methods.
Contribution
The authors develop a unified framework for robust MDPs that is broadly applicable, guarantees convergence and precision, and significantly improves computational efficiency.
Findings
Handles diverse uncertainty sets, including intervals and polytopes.
Converges in the limit and provides precision guarantees at any time during the computation.
Solves large-scale RMDPs with over a million states in under a minute.
Abstract
Markov decision processes (MDP) are a well-established model for sequential decision-making in the presence of probabilities. In robust MDP (RMDP), every action is associated with an uncertainty set of probability distributions, modelling that transition probabilities are not known precisely. Based on the known theoretical connection to stochastic games, we provide a framework for solving RMDPs that is generic, reliable, and efficient. It is *generic* both with respect to the model, allowing for a wide range of uncertainty sets, including but not limited to intervals, $L^1$- or $L^2$-balls, and polytopes; and with respect to the objective, including long-run average reward, undiscounted total reward, and stochastic shortest path. It is *reliable*, as our approach not only converges in the limit, but provides precision guarantees at any time during the computation. It is *efficient*…
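To make the notion of an uncertainty set concrete, the following is a minimal sketch of the inner step of a robust Bellman update for the interval case only. It is not the authors' implementation, and the function name `worst_case_expectation` is hypothetical. Given successor values and per-successor probability bounds, it uses the standard greedy construction to pick the distribution inside the interval set that minimises the expected value, which corresponds to the adversary's move in the stochastic-game view mentioned above.

```python
from typing import List, Sequence, Tuple


def worst_case_expectation(
    values: Sequence[float],
    lower: Sequence[float],
    upper: Sequence[float],
) -> Tuple[float, List[float]]:
    """Minimise sum_i p[i] * values[i] over all distributions p with
    lower[i] <= p[i] <= upper[i] and sum(p) == 1 (interval uncertainty set).

    Hypothetical helper for illustration; assumes the intervals are consistent.
    """
    if sum(lower) > 1.0 + 1e-12 or sum(upper) < 1.0 - 1e-12:
        raise ValueError("interval set contains no probability distribution")

    # Start from the lower bounds, then hand out the remaining probability
    # mass to the successors with the smallest values first.
    p = list(lower)
    budget = 1.0 - sum(lower)
    for i in sorted(range(len(values)), key=lambda j: values[j]):
        extra = min(upper[i] - lower[i], budget)
        p[i] += extra
        budget -= extra

    return sum(pi * vi for pi, vi in zip(p, values)), p


# Three successors with current value estimates and probability intervals.
value, dist = worst_case_expectation(
    values=[0.0, 1.0, 0.5],
    lower=[0.1, 0.2, 0.1],
    upper=[0.6, 0.7, 0.5],
)
# The adversary pushes as much mass as allowed onto the lowest-valued
# successor; the worst-case value is 0.3 (up to floating-point rounding).
```

Replacing the sort order with descending values gives the best-case distribution instead. Other set shapes named in the abstract, such as norm balls or general polytopes, would need a different inner optimisation than this greedy pass.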
