Reinforcement Learning with Quasi-Hyperbolic Discounting

S.R. Eshwar; Mayank Motwani; Nibedita Roy; Gugan Thoppe

arXiv:2409.10583 · cs.LG · September 18, 2024

TL;DR

This paper introduces the first model-free reinforcement learning algorithm for finding a Markov Perfect Equilibrium (MPE) under Quasi-Hyperbolic (QH) discounting, a discounting model that captures the human bias towards immediate gratification.

Contribution

The paper presents a novel model-free algorithm for computing an MPE in QH-discounted RL, supported by a two-timescale convergence analysis and numerical validation on an inventory system.

Findings

01. If the algorithm converges, its limit is an MPE.

02. The algorithm is validated numerically on an inventory system.

03. The work brings human-like, present-biased discounting into practical RL.

Abstract

Reinforcement learning has traditionally been studied with exponential discounting or the average reward setup, mainly due to their mathematical tractability. However, such frameworks fall short of accurately capturing human behavior, which has a bias towards immediate gratification. Quasi-Hyperbolic (QH) discounting is a simple alternative for modeling this bias. Unlike in traditional discounting, though, the optimal QH-policy starting from some time $t_1$ can differ from the one starting from $t_2$. Hence, the future self of an agent, if it is naive or impatient, can deviate from the policy that is optimal at the start, leading to sub-optimal overall returns. To prevent this behavior, an alternative is to work with a policy anchored in a Markov Perfect Equilibrium (MPE). In this work, we propose the first model-free algorithm for finding an MPE. Using a two-timescale analysis,…
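The time inconsistency described in the abstract can be illustrated with a minimal sketch. It assumes the standard quasi-hyperbolic form, in which an immediate reward has weight 1 and a reward delayed by k ≥ 1 steps has weight beta * delta**k. The symbol names `beta` and `delta`, the helper functions, and the reward numbers are illustrative assumptions and are not taken from the paper; this is not the paper's algorithm.

```python
def qh_weight(delay, beta, delta):
    """Assumed quasi-hyperbolic weight: 1 for an immediate reward,
    beta * delta**delay for a reward that is `delay` >= 1 steps away."""
    return 1.0 if delay == 0 else beta * delta ** delay


def preferred(now, options, beta, delta):
    """Return the label of the option with the highest discounted value,
    as evaluated by the agent standing at time `now`."""
    def value(label):
        arrival, reward = options[label]
        return reward * qh_weight(arrival - now, beta, delta)
    return max(options, key=value)


# Hypothetical choice: label -> (arrival time, reward).
options = {"small": (5, 10.0), "large": (6, 12.0)}

# QH agent (beta < 1): the plan made at time 0 is abandoned at time 5.
plan_at_0 = preferred(0, options, beta=0.5, delta=0.95)
plan_at_5 = preferred(5, options, beta=0.5, delta=0.95)

# Exponential agent (beta = 1): the same choice at both times.
exp_at_0 = preferred(0, options, beta=1.0, delta=0.95)
exp_at_5 = preferred(5, options, beta=1.0, delta=0.95)
```

Under these assumed numbers, the QH agent at time 0 ranks the larger, later reward higher, but once time 5 arrives the smaller reward is immediate and wins. With `beta = 1` the weights reduce to exponential discounting and the ranking does not change with the evaluation time.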

Taxonomy

Topics: Complex Systems and Time Series Analysis · Economic theories and models · Stochastic processes and financial applications