Partially Observable Discrete-time Discounted Markov Games with General Utility

Arnab Bhabak; Subhamay Saha

arXiv:2211.07888·math.OC·November 16, 2022·Oper. Res. Lett.

Open Access

TL;DR

This paper studies partially observable zero-sum Markov games with general utility functions, proving the existence of values and optimal policies by transforming the problem into a fully observable framework.

Contribution

It introduces a method to convert partially observable Markov games into fully observable ones while accounting for accumulated rewards, establishing foundational results for such games.

Findings

01 Existence of game value for finite and infinite horizons

02 Existence of optimal policies

03 Conversion technique from partial to full observability

Abstract

In this paper, we investigate partially observable zero-sum games where the state process is a discrete-time Markov chain. We consider a general utility function in the optimization criterion. We show the existence of a value for both finite- and infinite-horizon games and also establish the existence of optimal policies. The main step involves converting the partially observable game into a completely observable game that also keeps track of the total discounted accumulated reward/cost.
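To make the conversion step concrete, the following is a minimal Python sketch of one transition of a completely observable "augmented" state that carries a belief over the hidden state together with the discounted reward accumulated so far. It is an illustration under stated assumptions, not the paper's construction: it assumes finite hidden-state and observation sets, a fixed pair of player actions already folded into the transition matrix, and a stage reward that both players observe (so the running total is known exactly). All names (`AugmentedState`, `step`, `trans`, `obs_lik`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class AugmentedState:
    """Hypothetical fully observable state: belief plus reward bookkeeping."""
    belief: Tuple[float, ...]  # posterior distribution over hidden states
    total: float               # discounted reward accumulated so far
    discount: float            # beta**n, the weight of the next stage reward


def step(state: AugmentedState,
         trans: Sequence[Sequence[float]],
         obs_lik: Sequence[Sequence[float]],
         y: int,
         reward: float,
         beta: float) -> AugmentedState:
    """Advance the augmented state by one stage.

    trans[s][t]   : P(next hidden state = t | current = s), for the action
                    pair already chosen (assumed layout).
    obs_lik[t][y] : P(observation = y | next hidden state = t).
    reward        : stage reward, assumed observable to both players.
    """
    n = len(state.belief)
    # Bayes filter: predict with the transition law, then weight by the
    # likelihood of the observation actually received.
    unnorm = [
        obs_lik[t][y] * sum(state.belief[s] * trans[s][t] for s in range(n))
        for t in range(n)
    ]
    norm = sum(unnorm)
    if norm == 0.0:
        raise ValueError("observation has zero probability under this belief")
    belief = tuple(u / norm for u in unnorm)
    # Track the running discounted total and the next discount weight.
    return AugmentedState(belief,
                          state.total + state.discount * reward,
                          state.discount * beta)


# Usage with made-up numbers: two hidden states, two observations.
trans = [[0.9, 0.1], [0.2, 0.8]]
obs_lik = [[0.8, 0.2], [0.3, 0.7]]
s0 = AugmentedState(belief=(0.5, 0.5), total=0.0, discount=1.0)
s1 = step(s0, trans, obs_lik, y=1, reward=2.0, beta=0.9)
```

Carrying `total` and `discount` in the state is what lets a general utility be applied to the accumulated reward at the end of the horizon, rather than requiring the criterion to decompose additively across stages. When the stage reward depends on the unobserved state, the running total is itself not observable and a richer state (for example, a joint distribution over hidden state and accumulated reward) would be needed; this sketch does not cover that case.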

Taxonomy

Topics: Game Theory and Applications · Economic theories and models