Computational Hardness of Static Distributionally Robust Markov Decision Processes

Yan Li

arXiv:2511.02224·math.OC·May 8, 2026

TL;DR

This paper shows that finding optimal policies in static distributionally robust Markov decision processes is NP-hard, both for non-randomized and for randomized Markovian policies.

Contribution

It establishes NP-hardness results for static DRMDPs whose ambiguity set contains only two transition kernels, showing that the problem is hard even for very simple ambiguity sets.

Findings

01

Optimal policy computation is NP-hard for non-randomized Markovian policies.

02

NP-hardness persists even with only two transition kernels in the ambiguity set.

03

With randomized Markovian policies, the robust value function can have sub-optimal strict local minimizers, and finding the optimal policy remains NP-hard.

Abstract

We present some hardness results on finding the optimal policy for the static formulation of distributionally robust Markov decision processes. We construct problem instances such that when the considered policy class is Markovian and non-randomized, finding the optimal policy is NP-hard. When the considered policy class is Markovian and randomized, the robust value function possesses sub-optimal strict local minimizers, and finding the optimal policy is also NP-hard. The considered instances involve an ambiguity set with only two transition kernels.
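To make the setting concrete, here is a minimal Python sketch of a static robust problem with a two-kernel ambiguity set. It is an illustration under stated assumptions, not the paper's construction: it assumes a finite horizon, a cost-minimizing controller, and an adversary that picks one of the two kernels once and keeps it fixed across all stages. The function names (`policy_cost`, `robust_brute_force`) and the toy instance are hypothetical. The sketch enumerates every Markovian non-randomized policy, so its running time grows as |A|^(T·|S|); it shows what brute force costs and says nothing on its own about hardness.

```python
import itertools
import numpy as np


def policy_cost(kernel, cost, policy, horizon, init_state):
    """Expected total cost of a Markovian non-randomized policy under one kernel.

    kernel[a][s, s2] : probability of moving from s to s2 under action a
    cost[s, a]       : stage cost
    policy[t][s]     : action taken at stage t in state s
    """
    n_states = cost.shape[0]
    value = np.zeros(n_states)  # terminal value is zero (assumption)
    for t in reversed(range(horizon)):
        new_value = np.empty(n_states)
        for s in range(n_states):
            a = policy[t][s]
            new_value[s] = cost[s, a] + kernel[a][s] @ value
        value = new_value
    return value[init_state]


def robust_brute_force(kernels, cost, horizon, init_state):
    """Minimize the worst-case cost over all Markovian non-randomized policies."""
    n_states, n_actions = cost.shape
    best_policy, best_value = None, np.inf
    n_policies = 0
    for flat in itertools.product(range(n_actions), repeat=horizon * n_states):
        # Reshape the flat action tuple into one decision rule per stage.
        policy = [flat[t * n_states:(t + 1) * n_states] for t in range(horizon)]
        # Static adversary: a single kernel is used for the whole horizon.
        worst = max(policy_cost(k, cost, policy, horizon, init_state)
                    for k in kernels)
        n_policies += 1
        if worst < best_value:
            best_policy, best_value = policy, worst
    return best_policy, best_value, n_policies


# Toy instance (hypothetical): 2 states, 2 actions, horizon 2, start in state 0.
# Being in state 1 costs 1; the two kernels swap the roles of the actions.
cost = np.array([[0.0, 0.0],
                 [1.0, 1.0]])
kernel_1 = np.array([[[1.0, 0.0], [1.0, 0.0]],   # action 0 -> state 0
                     [[0.0, 1.0], [0.0, 1.0]]])  # action 1 -> state 1
kernel_2 = kernel_1[::-1].copy()                 # actions swapped
kernels = [kernel_1, kernel_2]

policy, robust_value, n_policies = robust_brute_force(kernels, cost, 2, 0)
nominal_values = [
    min(policy_cost(k, cost,
                    [flat[0:2], flat[2:4]], 2, 0)
        for flat in itertools.product(range(2), repeat=4))
    for k in kernels
]
print(robust_value, nominal_values, n_policies)
```

In this toy instance each kernel taken alone admits a zero-cost policy, yet every non-randomized policy incurs cost 1 under one of the two kernels, so the robust value over this policy class is 1. The gap between the two quantities is why the robust problem cannot be solved by optimizing against each kernel separately.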
