Demystifying the Recency Heuristic in Temporal-Difference Learning

Brett Daley; Marlos C. Machado; Martha White

arXiv:2406.12284·cs.LG·August 27, 2024

TL;DR

This paper gives a theoretical analysis of the recency heuristic in TD learning. It shows that return estimators satisfying the heuristic are guaranteed to converge, contract relatively quickly, and assign credit effectively, while estimators that violate it can diverge.

Contribution

The paper offers the first theoretical evidence that the recency heuristic in TD learning facilitates convergence and effective credit assignment.

Findings

01

Any return estimator satisfying the recency heuristic is guaranteed to converge to the correct value function.

02

Such an estimator has a relatively fast contraction rate.

03

Violating the heuristic can cause divergence in TD methods.

Abstract

The recency heuristic in reinforcement learning is the assumption that stimuli that occurred closer in time to an acquired reward should be more heavily reinforced. The recency heuristic is one of the key assumptions made by TD($\lambda$), which reinforces recent experiences according to an exponentially decaying weighting. In fact, all other widely used return estimators for TD learning, such as $n$-step returns, satisfy a weaker (i.e., non-monotonic) recency heuristic. Why is the recency heuristic effective for temporal credit assignment? What happens when credit is assigned in a way that violates this heuristic? In this paper, we analyze the specific mathematical implications of adopting the recency heuristic in TD learning. We prove that any return estimator satisfying this heuristic: 1) is guaranteed to converge to the correct value function, 2) has a relatively fast contraction…
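The "exponentially decaying weighting" mentioned in the abstract can be made concrete with a small sketch. The Python below is illustrative only and is not taken from the paper or its repository. It relies on two standard textbook identities for a fixed value function: the $\lambda$-return weights the $n$-step return by $(1-\lambda)\lambda^{n-1}$, and, written as a sum of TD errors, it weights the error $k$ steps ahead by $(\gamma\lambda)^k$, whereas an $n$-step return weights it by $\gamma^k$ for $k < n$ and by zero afterwards. All function names are hypothetical, and `is_non_increasing` is only a simple illustrative check, not the paper's formal definition of the heuristic.

```python
def lambda_return_weights(lam, horizon):
    """Weights (1 - lam) * lam**(n - 1) placed on the n-step returns, n = 1..horizon."""
    return [(1.0 - lam) * lam ** (n - 1) for n in range(1, horizon + 1)]


def td_error_weights_lambda(gamma, lam, horizon):
    """Weight on the TD error k steps ahead under the lambda-return: (gamma * lam)**k."""
    return [(gamma * lam) ** k for k in range(horizon)]


def td_error_weights_n_step(gamma, n, horizon):
    """Weight on the TD error k steps ahead under an n-step return:
    gamma**k for k < n, and zero from step n onward."""
    return [gamma ** k if k < n else 0.0 for k in range(horizon)]


def is_non_increasing(weights):
    """Illustrative check (an assumption, not the paper's definition):
    later TD errors never receive more weight than earlier ones."""
    return all(a >= b for a, b in zip(weights, weights[1:]))


if __name__ == "__main__":
    gamma, lam = 0.99, 0.9
    print(lambda_return_weights(lam, 5))
    print(td_error_weights_lambda(gamma, lam, 5))
    print(td_error_weights_n_step(gamma, 3, 5))
    print(is_non_increasing(td_error_weights_lambda(gamma, lam, 50)))
```

Printing the weight sequences side by side shows the contrast: the $\lambda$-return weights decay smoothly, while the $n$-step weights drop to zero abruptly at step $n$.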

Code & Models

Repositories

brett-daley/recency-heuristic
Official

Taxonomy

Topics: Speech and dialogue systems · Language, Discourse, Communication Strategies