Robust Deterministic Policies for Markov Decision Processes under Budgeted Uncertainty

Fei Wu; Erik Demeulemeester; Jannik Matuschke

arXiv:2412.12879·math.OC·December 18, 2024

Open Access

TL;DR

This paper investigates the complexity of computing robust deterministic policies for Markov Decision Processes under budgeted uncertainty, revealing NP-hardness and hardness of approximation, and proposing approximation algorithms for special cases.

Contribution

The paper proves that finding optimal deterministic policies in the LDST model is NP-hard and introduces approximation algorithms for special cases.

Findings

01 Optimal randomized policies are efficiently computable when only rewards are uncertain.

02 Computing optimal deterministic policies is NP-hard even in simple cases.

03 The paper gives approximation algorithms and hardness results for the general problem.

Abstract

This paper studies the computation of robust deterministic policies for Markov Decision Processes (MDPs) in the Lightning Does Not Strike Twice (LDST) model of Mannor, Mebel and Xu (ICML '12). In this model, designed to provide robustness in the face of uncertain input data while not being overly conservative, transition probabilities and rewards are uncertain and the uncertainty set is constrained by a budget that limits the number of states whose parameters can deviate from their nominal values. Mannor et al. (ICML '12) showed that optimal randomized policies for MDPs in the LDST regime can be efficiently computed when only the rewards are affected by uncertainty. In contrast to these findings, we observe that the computation of optimal deterministic policies is NP-hard even when only a single terminal reward may deviate from its nominal value and the MDP consists of $2$ time…
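To make the budgeted-uncertainty setting concrete, here is a minimal Python sketch of a two-period instance in which only terminal rewards are uncertain. It is an illustration under simplifying assumptions, not the paper's algorithm: the adversary is assumed to be static and to see the policy, it lowers the reward of at most `budget` terminal states from the nominal to a deviated value, and the best deterministic policy is found by brute-force enumeration (exponential in the number of first-period states). All names (`worst_case_value`, `best_deterministic_policy`, `INIT`, `TRANS`, `NOMINAL`, `DEVIATED`) and the numbers are hypothetical.

```python
from itertools import product

def worst_case_value(policy, init, trans, nominal, deviated, budget):
    """Worst-case expected terminal reward of a deterministic policy.

    policy[s] is the action chosen in first-period state s.
    Assumption: a static adversary lowers up to `budget` terminal rewards
    from nominal[t] to deviated[t] after seeing the policy.
    """
    n_term = len(nominal)
    # Probability of ending in each terminal state under the policy.
    reach = [0.0] * n_term
    for s, a in enumerate(policy):
        for t, p in enumerate(trans[s][a]):
            reach[t] += init[s] * p
    nominal_value = sum(reach[t] * nominal[t] for t in range(n_term))
    # Expected loss caused by deviating each terminal state, largest first.
    losses = sorted(
        (reach[t] * (nominal[t] - deviated[t]) for t in range(n_term)),
        reverse=True,
    )
    # The adversary spends its budget on the most damaging deviations.
    return nominal_value - sum(l for l in losses[:budget] if l > 0)

def best_deterministic_policy(init, trans, nominal, deviated, budget):
    """Enumerate all deterministic policies and return (policy, value)."""
    action_ranges = [range(len(trans[s])) for s in range(len(init))]
    best = None
    for policy in product(*action_ranges):
        value = worst_case_value(policy, init, trans, nominal, deviated, budget)
        if best is None or value > best[1]:
            best = (policy, value)
    return best

# Hypothetical instance: two first-period states, two actions each,
# three terminal states, and a budget of one deviating reward.
INIT = [0.5, 0.5]
TRANS = [
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],  # state 0: action 0 -> t0, action 1 -> t1
    [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]],  # state 1: action 0 -> t0, action 1 -> t2
]
NOMINAL = [10.0, 8.0, 6.0]
DEVIATED = [0.0, 0.0, 0.0]

if __name__ == "__main__":
    print(best_deterministic_policy(INIT, TRANS, NOMINAL, DEVIATED, budget=1))
```

In this toy instance the policy that is best under nominal rewards sends all probability mass to a single terminal state, so one deviation wipes out its entire value, while a policy that spreads mass over two terminal states loses only part of it. The enumeration is only feasible for tiny instances, which is in line with the hardness result stated in the abstract.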

Taxonomy

Topics: Forecasting Techniques and Applications