Discounting the Past

Taylor Dohmen; Ashutosh Trivedi

arXiv:2102.06985 · cs.GT · October 22, 2021 · 1 citation

TL;DR

This paper introduces past-discounting, in which agents have geometrically decaying memory of the rewards encountered during a play, into stochastic games. It analyzes the effect on strategy complexity and establishes determinacy results for certain objectives.

Contribution

The paper proposes past-discounting, studies its effect on game determinacy, and provides reductions to standard models for certain objectives.

Findings

01. Positional determinacy fails for the liminf of past-discounted rewards.

02. For the same liminf objective, optimal strategies may require unbounded memory.

03. Games with discounted and mean payoffs of past-discounted rewards are determined in stationary strategies.

Abstract

Stochastic games with discounted payoff, introduced by Shapley, model adversarial interactions in stochastic environments where two players try to optimize a discounted sum of rewards. In this model, long-term weights are geometrically attenuated based on the delay in their occurrence. We propose a temporally dual notion -- called past-discounting -- where agents have geometrically decaying memory of the rewards encountered during a play of the game. We study objective functions based on past-discounted weight sequences and examine the corresponding stochastic games with liminf, discounted, and mean payoffs. For objectives specified as the limit inferior of past-discounted reward sequences, we show that positional determinacy fails and that optimal strategies may require unbounded memory. To overcome this obstacle, we study an approximate windowed objective based on the idea of using…
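The contrast between the two notions can be illustrated on a finite reward sequence. The sketch below is a minimal illustration, not the paper's formal definition: it assumes the past-discounted value at step n is the unnormalized sum of lam**(n - i) * rewards[i] over i <= n, and the function names and the discount factor `lam` are hypothetical choices made for this example.

```python
from typing import List


def discounted_sum(rewards: List[float], lam: float) -> float:
    """Shapley-style discounted payoff: sum over i of lam**i * rewards[i].

    Later rewards are attenuated more strongly than earlier ones.
    """
    total = 0.0
    # Horner-style evaluation from the last reward back to the first.
    for r in reversed(rewards):
        total = r + lam * total
    return total


def past_discounted(rewards: List[float], lam: float) -> List[float]:
    """Assumed past-discounted sequence (illustrative definition only).

    Entry n is the sum over i <= n of lam**(n - i) * rewards[i], so older
    rewards fade geometrically from the agent's memory.
    """
    out: List[float] = []
    acc = 0.0
    for r in rewards:
        # Decay everything remembered so far, then add the newest reward.
        acc = lam * acc + r
        out.append(acc)
    return out


if __name__ == "__main__":
    early = [1.0, 0.0, 0.0]  # single reward at the start
    late = [0.0, 0.0, 1.0]   # single reward at the end
    print(discounted_sum(early, 0.5), discounted_sum(late, 0.5))
    print(past_discounted(early, 0.5), past_discounted(late, 0.5))
```

Under these assumptions, a reward at the start dominates the forward-discounted sum but has mostly faded from the final past-discounted value, while a reward at the end behaves the opposite way. The objectives named in the abstract (liminf, discounted, and mean payoff) would then be applied to the sequence returned by `past_discounted`.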

Taxonomy

Topics: Advanced Bandit Algorithms Research · Game Theory and Applications · Reinforcement Learning in Robotics