Reinforcement Learning with Non-Exponential Discounting

Matthias Schultheis; Constantin A. Rothkopf; Heinz Koeppl

arXiv:2209.13413 · cs.LG · December 8, 2022 · 1 cites

TL;DR

This paper develops a continuous-time reinforcement learning framework that accommodates arbitrary discount functions, including hyperbolic discounting, using a Hamilton-Jacobi-Bellman equation and deep learning methods.

Contribution

The paper introduces a generalized RL theory for non-exponential discounting, derives an HJB equation together with a collocation-based solution approach, and explores inverse RL for recovering the discount function.

Findings

01  Validated on two simulated problems.

02  Demonstrated applicability to non-exponential discounting.

03  Provided a method for analyzing human decision-making patterns.

Abstract

Commonly in reinforcement learning (RL), rewards are discounted over time using an exponential function to model time preference, thereby bounding the expected long-term reward. In contrast, in economics and psychology, it has been shown that humans often adopt a hyperbolic discounting scheme, which is optimal when a specific task termination time distribution is assumed. In this work, we propose a theory for continuous-time model-based reinforcement learning generalized to arbitrary discount functions. This formulation covers the case in which there is a non-exponential random termination time. We derive a Hamilton-Jacobi-Bellman (HJB) equation characterizing the optimal policy and describe how it can be solved using a collocation method, which uses deep learning for function approximation. Further, we show how the inverse RL problem can be approached, in which one tries to recover…
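
The link between hyperbolic discounting and a random termination time can be illustrated numerically. The sketch below is not taken from the paper. It uses a standard construction as an assumption: the task ends at a constant but unknown hazard rate, and that rate has an exponential prior with mean k. Under this assumption the probability that the task is still running at time t works out to 1 / (1 + k t), which is the hyperbolic discount function. All names (`hyperbolic_discount`, `survival_probability_mc`, the parameter `k`) are hypothetical and chosen for illustration only.

```python
import math
import random


def exponential_discount(t, rate):
    """Exponential discount factor exp(-rate * t), the usual RL choice."""
    return math.exp(-rate * t)


def hyperbolic_discount(t, k):
    """Hyperbolic discount factor 1 / (1 + k * t)."""
    return 1.0 / (1.0 + k * t)


def survival_probability_mc(t, k, n_samples=100_000, seed=0):
    """Monte Carlo estimate of P(task has not terminated by time t).

    Assumption (illustrative, not from the paper): the hazard rate is
    constant but unknown, with an exponential prior of mean k.
    For a fixed hazard rate h the survival probability is exp(-h * t);
    averaging over the prior gives the marginal survival probability.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        # expovariate takes the inverse of the mean as its argument.
        hazard = rng.expovariate(1.0 / k)
        total += math.exp(-hazard * t)
    return total / n_samples


if __name__ == "__main__":
    k = 1.0
    for t in (0.5, 2.0, 5.0):
        mc = survival_probability_mc(t, k)
        print(f"t={t}: simulated survival={mc:.4f}, "
              f"hyperbolic={hyperbolic_discount(t, k):.4f}, "
              f"exponential={exponential_discount(t, k):.4f}")
```

Running the script prints the simulated survival probability next to both discount functions. Under the stated prior, the simulated values should track the hyperbolic curve rather than the exponential one, and the hyperbolic curve decays more slowly at long horizons.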

Videos

Reinforcement Learning with Non-Exponential Discounting · SlidesLive

Taxonomy

Topics: Decision-Making and Behavioral Economics