Markov Decision Processes with Time-Varying Geometric Discounting

Jiarui Gan; Annika Hennes; Rupak Majumdar; Debmalya Mandal; Goran Radanovic

arXiv:2307.10491·cs.AI·July 21, 2023

TL;DR

This paper explores infinite-horizon Markov decision processes with time-varying discount factors, analyzing the existence and computation of subgame perfect equilibria from a game-theoretic perspective.

Contribution

It introduces a game-theoretic framework for MDPs with time-varying discounting, proves the existence of equilibrium, and develops algorithms for approximate solutions.

Findings

01

Existence of subgame perfect equilibrium (SPE) in time-varying discount MDPs

02

Computing an SPE is EXPTIME-hard

03

An algorithm for computing an $\epsilon$-SPE, with complexity bounds

Abstract

Canonical models of Markov decision processes (MDPs) usually consider geometric discounting based on a constant discount factor. While this standard modeling approach has led to many elegant results, some recent studies indicate the necessity of modeling time-varying discounting in certain applications. This paper studies a model of infinite-horizon MDPs with time-varying discount factors. We take a game-theoretic perspective, whereby each time step is treated as an independent decision maker with their own (fixed) discount factor, and we study the subgame perfect equilibrium (SPE) of the resulting game as well as the related algorithmic problems. We present a constructive proof of the existence of an SPE and demonstrate the EXPTIME-hardness of computing an SPE. We also turn to the approximate notion of $\epsilon$-SPE and show that an $\epsilon$-SPE exists under milder assumptions.…
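
To make the "one decision maker per time step" view concrete, here is a minimal Python sketch on a finite-horizon truncation of the model. It is an illustration under stated assumptions, not the paper's algorithm. It assumes the agent at step $t$ values a reward received $k$ steps later at $\gamma_t^k$ times its face value, and that every agent best-responds to the policies of the later agents (backward induction). The function name `truncated_equilibrium`, the array layout, and the toy MDP are all hypothetical.

```python
import numpy as np

def truncated_equilibrium(P, R, gammas):
    """Backward induction over a horizon of len(gammas) steps.

    P: (S, A, S) transition probabilities, R: (S, A) rewards,
    gammas[t]: discount factor of the agent acting at step t (assumed model).
    Returns an integer array policy[t, s] with the chosen action.
    """
    S, A = R.shape
    T = len(gammas)
    policy = np.zeros((T, S), dtype=int)
    # future[k, s]: expected reward collected k steps after the next step,
    # given that the next state is s and later agents follow their policies.
    future = np.zeros((0, S))
    states = np.arange(S)
    for t in reversed(range(T)):
        g = gammas[t]
        # Agent t weighs rewards at steps t+1, t+2, ... by g, g^2, ...
        weights = g ** np.arange(1, future.shape[0] + 1)
        continuation = weights @ future          # shape (S,)
        Q = R + P @ continuation                 # shape (S, A)
        policy[t] = Q.argmax(axis=1)
        # Per-step expected rewards are stored undiscounted, because each
        # earlier agent re-weights them with its own discount factor.
        P_pi = P[states, policy[t]]              # (S, S)
        R_pi = R[states, policy[t]]              # (S,)
        future = np.vstack([R_pi, future @ P_pi.T])
    return policy

# Hypothetical toy MDP: in state 0, action 0 pays 1 and stays put, while
# action 1 pays 0 now and moves to state 1, which pays 3 and then moves to
# an absorbing zero-reward state 2.
P = np.zeros((3, 2, 3))
P[0, 0, 0] = 1.0
P[0, 1, 1] = 1.0
P[1, :, 2] = 1.0
P[2, :, 2] = 1.0
R = np.array([[1.0, 0.0], [3.0, 3.0], [0.0, 0.0]])

patient = truncated_equilibrium(P, R, gammas=[0.9, 0.9])
impatient = truncated_equilibrium(P, R, gammas=[0.1, 0.9])
print(patient[0, 0], impatient[0, 0])
```

The sketch stores expected rewards per future step rather than a single value function, since a value discounted with one agent's factor cannot be reused by an agent with a different factor. In the toy MDP, only the first agent's discount factor differs between the two runs, and it alone decides whether that agent takes the delayed reward.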
