Zero-Sum Semi-Markov Games with State-Action-Dependent Discount Factors

Zhihui Yu; Xianping Guo; Li Xia

arXiv:2103.04084 · cs.GT · March 9, 2021

Open Access

TL;DR

This paper studies two-player zero-sum semi-Markov games with state-action-dependent discount factors. It establishes the existence of the value function and of a pair of optimal strategies, and gives a value iteration algorithm with convergence guarantees for finite state and action spaces.

Contribution

The paper introduces a general semi-Markov game model with state-action-dependent discount factors, proves that the value function and a pair of optimal stationary strategies exist under regularity, continuity-compactness, and drift conditions, and develops a value iteration algorithm for the finite case.

Findings

01

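Existence of the value function and a pair of optimal stationary strategies, under the regularity, continuity-compactness, and drift conditions, with discount factors bounded below by a positive constant.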

02

Development of a convergent value iteration algorithm for finite state and action spaces.

03

Numerical examples demonstrating theoretical results.
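
To make the value iteration finding concrete, here is a minimal sketch of the idea for a finite game. It is not the paper's algorithm or notation. It assumes, purely for illustration, that each player has two actions (so each matrix game has a closed-form value), that sojourn times are exponential with rate `lam[x][a][b]` and independent of the next state, and that `r[x][a][b]` already denotes the expected discounted payoff collected during one sojourn. All names (`value_2x2`, `shapley_step`, `value_iteration`, `r`, `p`, `lam`, `alpha`) are hypothetical.

```python
def value_2x2(m):
    """Value of a 2x2 zero-sum matrix game; the row player maximizes."""
    (a, b), (c, d) = m
    lower = max(min(a, b), min(c, d))  # best pure guarantee for the maximizer
    upper = min(max(a, c), max(b, d))  # best pure guarantee for the minimizer
    if lower == upper:                 # pure saddle point exists
        return lower
    # No pure saddle point: both players mix (classical closed form).
    return (a * d - b * c) / (a + d - b - c)


def shapley_step(u, r, p, lam, alpha):
    """Apply one Shapley-type update to the value estimate u.

    Assumption: exponential sojourn times with rate lam[x][a][b], so the
    expected discount over one sojourn is lam / (lam + alpha), where
    alpha[x][a][b] is the state-action-dependent discount factor.
    """
    n = len(u)
    new_u = []
    for x in range(n):
        game = [[0.0, 0.0], [0.0, 0.0]]
        for a in range(2):
            for b in range(2):
                beta = lam[x][a][b] / (lam[x][a][b] + alpha[x][a][b])
                cont = sum(p[x][a][b][y] * u[y] for y in range(n))
                game[a][b] = r[x][a][b] + beta * cont
        new_u.append(value_2x2(game))
    return new_u


def value_iteration(r, p, lam, alpha, tol=1e-10, max_iter=10_000):
    """Iterate shapley_step from u = 0 until successive iterates differ by < tol."""
    u = [0.0] * len(r)
    for _ in range(max_iter):
        new_u = shapley_step(u, r, p, lam, alpha)
        if max(abs(s - t) for s, t in zip(new_u, u)) < tol:
            return new_u
        u = new_u
    return u
```

In this sketch, a positive lower bound on `alpha` together with finite rates `lam` keeps every effective discount `lam / (lam + alpha)` strictly below one, which is what makes the update a contraction and the iteration converge. With more than two actions per player, the closed-form `value_2x2` would have to be replaced by a linear program for the matrix game.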

Abstract

The semi-Markov model is one of the most general models for stochastic dynamic systems. This paper deals with a two-person zero-sum game for semi-Markov processes. We focus on the expected discounted payoff criterion with state-action-dependent discount factors. The state and action spaces are both Polish spaces, and the payoff function is $\omega$-bounded. We first construct a fairly general model of semi-Markov games under a given semi-Markov kernel and a pair of strategies. Next, based on the standard regularity condition and the continuity-compactness condition for semi-Markov games, we derive a "drift condition" on the semi-Markov kernel and suppose that the discount factors have a positive lower bound, under which the existence of the value function and of a pair of optimal stationary strategies for our semi-Markov game is proved by using the Shapley equation. Moreover, when the state…

Taxonomy

Topics: Risk and Portfolio Optimization · Reinforcement Learning in Robotics · Economic theories and models