Global Convergence for Average Reward Constrained MDPs with Primal-Dual Actor Critic Algorithm

Yang Xu, Swetha Ganesh, Washim Uddin Mondal, Qinbo Bai, and Vaneet Aggarwal

arXiv:2505.15138·cs.LG·December 11, 2025


TL;DR

This paper introduces a primal-dual natural actor-critic algorithm for infinite-horizon average reward constrained MDPs with general parametrization. Its global convergence and constraint violation rates match the theoretical lower bound for Markov Decision Processes when the mixing time is known.

Contribution

It presents a novel primal-dual natural actor-critic algorithm with proven global convergence and constraint violation rate guarantees for average reward CMDPs, including the case where the mixing time is unknown to the learner.

Findings

01

Achieves $\tilde{O}(1/\sqrt{T})$ convergence rate with known mixing time

02

Without knowledge of the mixing time, achieves rates of $\tilde{O}(1/T^{0.5 - \epsilon})$, provided that $T \geq \tilde{O}(\tau_{mix}^{2/\epsilon})$

03

Matches the theoretical lower bound for Markov Decision Processes, establishing a new benchmark for average reward CMDPs

Abstract

This paper investigates infinite-horizon average reward Constrained Markov Decision Processes (CMDPs) with general parametrization. We propose a Primal-Dual Natural Actor-Critic algorithm that adeptly manages constraints while ensuring a high convergence rate. In particular, our algorithm achieves global convergence and constraint violation rates of $\tilde{O}(1/\sqrt{T})$ over a horizon of length $T$ when the mixing time, $\tau_{mix}$, is known to the learner. In the absence of knowledge of $\tau_{mix}$, the achievable rates change to $\tilde{O}(1/T^{0.5 - \epsilon})$, provided that $T \geq \tilde{O}(\tau_{mix}^{2/\epsilon})$. Our results match the theoretical lower bound for Markov Decision Processes and establish a new benchmark in the theoretical exploration of average reward CMDPs.
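
To make the trade-off in the abstract concrete, the following is a minimal numeric sketch of the two rates and the horizon condition. It is an illustration only: the function names are hypothetical, constants and logarithmic factors hidden by $\tilde{O}$ are dropped, and $\epsilon$ is assumed to lie strictly between 0 and 0.5.

```python
import math


def rate_with_mixing_time(T: float) -> float:
    """Leading-order rate 1/sqrt(T) when tau_mix is known (log factors ignored)."""
    return 1.0 / math.sqrt(T)


def rate_without_mixing_time(T: float, eps: float) -> float:
    """Leading-order rate 1/T^(0.5 - eps) when tau_mix is unknown (log factors ignored)."""
    if not 0.0 < eps < 0.5:
        # Assumption of this sketch: eps in (0, 0.5) so the rate still decays with T.
        raise ValueError("eps is assumed to lie in (0, 0.5)")
    return T ** -(0.5 - eps)


def min_horizon(tau_mix: float, eps: float) -> float:
    """Leading-order horizon requirement T >= tau_mix^(2/eps) (constants and logs ignored)."""
    return tau_mix ** (2.0 / eps)


if __name__ == "__main__":
    tau_mix, eps = 10.0, 0.25  # illustrative values, not taken from the paper
    T = min_horizon(tau_mix, eps)
    print(f"required horizon      : {T:.3g}")
    print(f"rate, tau_mix known   : {rate_with_mixing_time(T):.3g}")
    print(f"rate, tau_mix unknown : {rate_without_mixing_time(T, eps):.3g}")
```

A smaller $\epsilon$ brings the second rate closer to $1/\sqrt{T}$, but the required horizon $\tau_{mix}^{2/\epsilon}$ grows quickly as $\epsilon$ shrinks.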


Taxonomy

Topics: Auction Theory and Applications