Partially Observable Discrete-time Discounted Markov Games with General Utility
Arnab Bhabak, Subhamay Saha

TL;DR
This paper studies partially observable zero-sum Markov games with general utility functions, proving the existence of values and optimal policies by transforming the problem into a fully observable framework.
Contribution
It introduces a method for converting a partially observable Markov game into a fully observable one whose state also tracks the accumulated discounted reward, and uses this conversion to establish the existence of values and optimal policies.
Findings
Existence of game value for finite and infinite horizons
Existence of optimal policies
Conversion technique from partial to full observability
Abstract
In this paper, we investigate partially observable zero-sum games where the state process is a discrete-time Markov chain. We consider a general utility function in the optimization criterion. We show the existence of value for both finite and infinite horizon games and also establish the existence of optimal policies. The main step involves converting the partially observable game into a completely observable game which also keeps track of the total discounted accumulated reward/cost.
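The conversion described in the abstract can be illustrated with a minimal sketch. It is not the paper's construction; it assumes finite state, action, and observation sets and uses hypothetical names throughout (`update_belief`, `P`, `Q`, `r`, `z`). The idea shown is a Bayes filter whose belief lives on pairs (hidden state, accumulated discounted reward), so that the fully observable state carries enough information to evaluate a general utility of the total reward.

```python
from collections import defaultdict


def update_belief(belief, a, b, y, z, P, Q, r):
    """One filter step on an augmented belief (illustrative assumption only).

    belief : dict mapping (x, s) -> probability, where x is the hidden state
             and s is the discounted reward accumulated so far
    a, b   : actions chosen by player 1 and player 2
    y      : observation received after the transition
    z      : current discount weight (beta**n at stage n)
    P      : dict mapping (x, a, b) -> dict of next state -> probability
    Q      : dict mapping next state -> dict of observation -> probability
    r      : function (x, a, b) -> one-stage reward
    """
    new_belief = defaultdict(float)
    for (x, s), p in belief.items():
        # The accumulated reward grows by the discounted one-stage reward.
        # Rounding merges float keys that differ only by rounding noise.
        s_next = round(s + z * r(x, a, b), 12)
        for x_next, p_trans in P[(x, a, b)].items():
            weight = p * p_trans * Q[x_next].get(y, 0.0)
            if weight > 0.0:
                new_belief[(x_next, s_next)] += weight
    total = sum(new_belief.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    # Normalize so the updated belief is again a probability distribution.
    return {key: value / total for key, value in new_belief.items()}
```

In this sketch the caller would multiply `z` by the discount factor after each stage, and the pair (updated belief, `z`) would play the role of the state of the completely observable game.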
Taxonomy
Topics: Game Theory and Applications · Economic theories and models
