Infinite Horizon Average Cost Dynamic Programming Subject to Total Variation Distance Ambiguity

Ioannis Tzortzis; Charalambos D. Charalambous; Themistoklis Charalambous

arXiv:1512.06510·math.OC·December 22, 2015·SIAM J. Control. Optim.

TL;DR

This paper develops new dynamic programming equations and policy iteration algorithms for infinite horizon average cost Markov control models under total variation distance ambiguity, with applications demonstrated through examples.

Contribution

It introduces generalized dynamic programming equations and policy iteration algorithms accounting for total variation ambiguity in controlled Markov processes.

Findings

01. New policy iteration algorithms using water-filling solutions.

02. Conditions for irreducibility of maximizing distributions.

03. Application to finite and Borel space models.

Abstract

We analyze the infinite horizon minimax average cost Markov Control Model (MCM), for a class of controlled process conditional distributions, which belong to a ball, with respect to total variation distance metric, centered at a known nominal controlled conditional distribution with radius $R \in [0, 2]$, in which the minimization is over the control strategies and the maximization is over conditional distributions. Upon performing the maximization, a dynamic programming equation is obtained which includes, in addition to the standard terms, the oscillator semi-norm of the cost-to-go. First, the dynamic programming equation is analyzed for finite state and control spaces. We show that if the nominal controlled process distribution is irreducible, then for every stationary Markov control policy the maximizing conditional distribution of the controlled process is also irreducible for $R \in…
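
The inner maximization described in the abstract can be illustrated on a finite state space. The sketch below is not taken from the paper: it assumes total variation is measured as the L1 distance between probability vectors (consistent with a radius in $[0, 2]$), and the names `maximizing_distribution`, `mu`, `cost`, and `R` are hypothetical. It moves up to $R/2$ of probability mass onto a highest-cost state and removes the same amount from the lowest-cost states in increasing order of cost, which is the water-filling pattern the findings refer to.

```python
import numpy as np

def maximizing_distribution(mu, cost, R):
    """Sketch: maximize sum(nu * cost) over probability vectors nu
    with sum(|nu - mu|) <= R, for a nominal distribution mu.
    Assumes TV distance is the L1 distance, so R lies in [0, 2]."""
    mu = np.asarray(mu, dtype=float)
    cost = np.asarray(cost, dtype=float)
    nu = mu.copy()

    # States attaining the maximum cost.
    top = np.flatnonzero(cost == cost.max())

    # Mass to move: at most R/2, and no more than the mass outside the top set.
    alpha = min(R / 2.0, 1.0 - mu[top].sum())

    # Put the moved mass on one top state (any split over the top set
    # gives the same expected cost).
    nu[top[0]] += alpha

    # Remove the same mass from the cheapest states first (water-filling).
    remaining = alpha
    for i in np.argsort(cost, kind="stable"):
        if remaining <= 0 or cost[i] == cost.max():
            break
        take = min(nu[i], remaining)
        nu[i] -= take
        remaining -= take
    return nu

mu = [0.5, 0.3, 0.2]      # nominal distribution (illustrative numbers)
cost = [1.0, 2.0, 4.0]    # stand-in for the cost-to-go on each state
nu = maximizing_distribution(mu, cost, R=0.4)
worst_case = float(np.dot(nu, cost))
nominal = float(np.dot(mu, cost))
```

In this example the lowest-cost state holds more than $R/2$ of nominal mass, so the worst-case expectation equals the nominal expectation plus $(R/2)(\max \text{cost} - \min \text{cost})$, which is how the oscillator semi-norm term enters. For larger $R$ the removal spills over to the next-cheapest states and the simple formula no longer applies.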
