Projection-based Lyapunov method for fully heterogeneous weakly-coupled MDPs
Xiangcheng Zhang, Yige Hong, Weina Wang

TL;DR
This paper introduces a projection-based Lyapunov method to address the challenge of heterogeneity in weakly-coupled Markov decision processes, achieving asymptotic optimality as the number of arms grows large.
Contribution
It presents the first asymptotic optimality result for fully heterogeneous WCMDPs using a novel Lyapunov function construction.
Findings
Achieves an $O(1/\sqrt{N})$ optimality gap for large $N$
First asymptotic optimality result for fully heterogeneous WCMDPs
Introduces a projection-based Lyapunov approach for convergence certification
Abstract
Heterogeneity poses a fundamental challenge for many real-world large-scale decision-making problems but remains largely understudied. In this paper, we study the fully heterogeneous setting of a prominent class of such problems, known as weakly-coupled Markov decision processes (WCMDPs). Each WCMDP consists of $N$ arms (or subproblems), which have distinct model parameters in the fully heterogeneous setting, leading to the curse of dimensionality when $N$ is large. We show that, under mild assumptions, an efficiently computable policy achieves an $O(1/\sqrt{N})$ optimality gap in the long-run average reward per arm for fully heterogeneous WCMDPs as $N$ becomes large. This is the first asymptotic optimality result for fully heterogeneous average-reward WCMDPs. Our main technical innovation is the construction of projection-based Lyapunov functions that certify the convergence of rewards…
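As a reading aid, the optimality-gap claim can be written out as follows. The symbols are assumptions introduced here for illustration and are not taken from the paper: $R^*(N)$ stands for the optimal long-run average reward per arm in the $N$-arm problem, and $R(\pi, N)$ for the long-run average reward per arm achieved by the proposed policy $\pi$.

$$
R^*(N) - R(\pi, N) = O\!\left(\frac{1}{\sqrt{N}}\right) \quad \text{as } N \to \infty .
$$

Under this reading, the per-arm reward lost relative to the optimum shrinks at least as fast as a constant times $1/\sqrt{N}$, which is the sense in which the policy is asymptotically optimal.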
Taxonomy
Topics: Auction Theory and Applications
