Linear Mixture Distributionally Robust Markov Decision Processes

Zhishuai Liu; Pan Xu

arXiv:2505.18044 · cs.LG · May 26, 2025

TL;DR

This paper introduces a linear mixture distributionally robust Markov decision process framework that refines uncertainty modeling and provides theoretical guarantees for robust policy learning under various divergence measures.

Contribution

It proposes a novel linear mixture DRMDP framework with refined uncertainty sets and a meta algorithm for robust policy learning, supported by sample complexity analysis.

Findings

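01. Uncertainty sets are defined as a ball around the mixture weighting parameter rather than around the nominal kernel, giving a more refined representation of uncertainty than traditional models.

02. Sample complexity bounds are established for the meta algorithm under multiple divergence measures.

03. The analysis lays a theoretical foundation for future research on robust MDPs.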

Abstract

Many real-world decision-making problems face the off-dynamics challenge: the agent learns a policy in a source domain and deploys it in a target domain with different state transitions. The distributionally robust Markov decision process (DRMDP) addresses this challenge by finding a robust policy that performs well under the worst-case environment within a pre-specified uncertainty set of transition dynamics. Its effectiveness heavily hinges on the proper design of these uncertainty sets, based on prior knowledge of the dynamics. In this work, we propose a novel linear mixture DRMDP framework, where the nominal dynamics is assumed to be a linear mixture model. In contrast with existing uncertainty sets directly defined as a ball centered around the nominal kernel, linear mixture DRMDPs define the uncertainty sets based on a ball around the mixture weighting parameter. We show that this…
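The contrast drawn in the abstract, perturbing the mixture weights instead of the transition kernel itself, can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the toy numbers, the restriction of the weights to the probability simplex, and the choice of a total variation ball of radius `rho` as the divergence. For a single state-action pair, the nominal kernel is a weighted sum of known basis kernels, and the worst-case expected value is found by shifting up to `rho` weight from the highest-value basis kernels to the lowest-value one.

```python
def mixture_kernel(theta, basis_kernels):
    """Next-state distribution sum_i theta[i] * basis_kernels[i]."""
    n_next = len(basis_kernels[0])
    return [sum(t * p[s] for t, p in zip(theta, basis_kernels))
            for s in range(n_next)]


def expected_value(kernel, values):
    """Expected next-state value under a next-state distribution."""
    return sum(p * v for p, v in zip(kernel, values))


def worst_case_weights_tv(theta, component_values, rho):
    """Minimize sum_i w[i] * component_values[i] over simplex weights w
    with total variation distance to theta at most rho (assumed setup).

    The objective is linear in w, so a greedy transfer is optimal: move
    mass from the highest-value components to the lowest-value one.
    """
    w = list(theta)
    indices = range(len(w))
    lowest = min(indices, key=lambda i: component_values[i])
    budget = rho
    for i in sorted(indices, key=lambda i: component_values[i], reverse=True):
        if budget <= 0:
            break
        if i == lowest:
            continue
        moved = min(w[i], budget)  # cannot take more mass than w[i] holds
        w[i] -= moved
        w[lowest] += moved
        budget -= moved
    return w


# Hypothetical toy problem: three basis kernels over two next states.
basis_kernels = [[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]]
theta_nominal = [0.5, 0.3, 0.2]
next_state_values = [1.0, 0.0]
rho = 0.2

# Expected value under each basis kernel taken alone.
component_values = [expected_value(p, next_state_values) for p in basis_kernels]

nominal_value = expected_value(
    mixture_kernel(theta_nominal, basis_kernels), next_state_values)
theta_worst = worst_case_weights_tv(theta_nominal, component_values, rho)
robust_value = expected_value(
    mixture_kernel(theta_worst, basis_kernels), next_state_values)

print("nominal value:", nominal_value)
print("worst-case weights:", theta_worst)
print("robust value:", robust_value)
```

In this sketch the adversary can only reweight the given basis kernels, so the perturbed kernel always stays inside their span. A ball drawn directly around the nominal kernel would admit transitions that no reweighting of the basis kernels can produce, which is one way to read the abstract's claim that prior knowledge of the mixture structure yields a tighter uncertainty set.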

Taxonomy

Methods: Sparse Evolutionary Training