Discounting in Games across Time Scales

Krishnendu Chatterjee (IST Austria); Rupak Majumdar (UCLA)

arXiv:1006.1403 · cs.LO · June 9, 2010 · GANDALF

TL;DR

This paper introduces two-level discounted games, in which the upper level is a discounted game and the lower level is an undiscounted reachability game. It proves that pure memoryless optimal strategies exist for both players and shows that values can be computed in polynomial time in the single-player case (Markov decision processes).

Contribution

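It presents a two-level game model for hierarchical decision making across time scales, proves that both players have pure memoryless optimal strategies, establishes an ordered field property, shows polynomial-time value computation for Markov decision processes, and gives a strategy improvement algorithm for computing values.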

Findings

01

Pure memoryless optimal strategies exist for both players.

02

Values can be computed in polynomial time in the single-player case (Markov decision processes).

03

Deciding whether the value of a player equals a given rational constant is in NP ∩ coNP.

Abstract

We introduce two-level discounted games played by two players on a perfect-information stochastic game graph. The upper level game is a discounted game and the lower level game is an undiscounted reachability game. Two-level games model hierarchical and sequential decision making under uncertainty across different time scales. We show the existence of pure memoryless optimal strategies for both players and an ordered field property for such games. We show that if there is only one player (Markov decision processes), then the values can be computed in polynomial time. It follows that whether the value of a player is equal to a given rational constant in two-level discounted games can be decided in NP ∩ coNP. We also give an alternate strategy improvement algorithm to compute the value.
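
To make the notion of a pure memoryless optimal strategy concrete, the following is a minimal sketch of textbook value iteration on an ordinary single-level discounted Markov decision process. It is not the paper's two-level construction or its strategy improvement algorithm. The function name `value_iteration`, the toy states and actions, the total-discounted-reward objective, and the discount factor 0.9 are all assumptions made for illustration.

```python
def value_iteration(transitions, rewards, gamma, tol=1e-10, max_iters=100_000):
    """Approximate the values of a discounted MDP by value iteration.

    transitions[s][a] is a list of (probability, next_state) pairs and
    rewards[s][a] is the immediate reward (both hypothetical encodings).
    Returns (values, strategy), where strategy maps each state to one action.
    """
    values = {s: 0.0 for s in transitions}

    def q(s, a):
        # One-step lookahead: reward now plus discounted expected value.
        return rewards[s][a] + gamma * sum(p * values[t] for p, t in transitions[s][a])

    for _ in range(max_iters):
        new_values = {s: max(q(s, a) for a in transitions[s]) for s in transitions}
        delta = max(abs(new_values[s] - values[s]) for s in transitions)
        values = new_values
        if delta < tol:
            break

    # A pure memoryless strategy: one fixed action per state, no history, no randomization.
    strategy = {s: max(transitions[s], key=lambda a: q(s, a)) for s in transitions}
    return values, strategy


# Toy MDP (illustrative only): reaching "goal" yields reward 1 per step forever.
transitions = {
    "s0": {"risky": [(0.5, "goal"), (0.5, "s0")], "slow": [(1.0, "s1")]},
    "s1": {"go": [(1.0, "goal")]},
    "goal": {"stay": [(1.0, "goal")]},
}
rewards = {
    "s0": {"risky": 0.0, "slow": 0.0},
    "s1": {"go": 0.0},
    "goal": {"stay": 1.0},
}

values, strategy = value_iteration(transitions, rewards, gamma=0.9)
print(values)
print(strategy)
```

The returned `strategy` picks a single action per state, which is exactly what "pure memoryless" means. In this toy instance the greedy choice in `s0` depends on the discount factor, a small reminder that the time scale affects which strategy is optimal.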
