Expected Eligibility Traces

Hado van Hasselt; Sephora Madjiheurem; Matteo Hessel; David Silver; André Barreto; Diana Borsa

arXiv:2007.01839·cs.LG·February 9, 2021

TL;DR

This paper introduces expected eligibility traces for reinforcement learning. A single update assigns credit not only to the states the agent recently visited but also to states that could have preceded the current state, which can improve learning efficiency.

Contribution

The work proposes expected eligibility traces, which generalize classic (instantaneous) traces: a single update can reach counterfactual states and actions that could have preceded the current state, and a smooth interpolation mechanism generalizes TD(λ).

Findings

01. Expected traces can outperform classic (instantaneous) traces in temporal-difference learning, sometimes substantially.

02. The interpolation mechanism generalizes TD(λ) and enhances credit assignment.

03. Potential connections to successor features are discussed.

Abstract

The question of how to determine which states and actions are responsible for a certain outcome is known as the credit assignment problem and remains a central research question in reinforcement learning and artificial intelligence. Eligibility traces enable efficient credit assignment to the recent sequence of states and actions experienced by the agent, but not to counterfactual sequences that could also have led to the current state. In this work, we introduce expected eligibility traces. Expected traces allow, with a single update, to update states and actions that could have preceded the current state, even if they did not do so on this occasion. We discuss when expected traces provide benefits over classic (instantaneous) traces in temporal-difference learning, and show that sometimes substantial improvements can be attained. We provide a way to smoothly interpolate between…
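
To make the idea in the abstract concrete, here is a minimal tabular sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the five-state random walk, the function name `train`, and the step sizes `alpha` and `beta` are all hypothetical choices made for this example. The sketch keeps a classic accumulating trace `e`, learns a per-state estimate `z[s]` of the trace the agent expects to hold on arriving in state `s`, and applies the TD error along `z[s]` rather than along the sampled trace.

```python
import numpy as np

def train(n_episodes=2000, n_states=5, alpha=0.05, beta=0.1,
          gamma=1.0, lam=0.8, seed=0):
    """Tabular TD learning driven by a learned expected trace (sketch)."""
    rng = np.random.default_rng(seed)
    w = np.zeros(n_states)              # value estimates, one per state
    z = np.zeros((n_states, n_states))  # z[s] estimates E[e_t | S_t = s]

    for _ in range(n_episodes):
        e = np.zeros(n_states)          # instantaneous accumulating trace
        s = n_states // 2               # every episode starts in the middle
        done = False
        while not done:
            # Toy dynamics (assumption): step left or right uniformly.
            s_next = s + (1 if rng.random() < 0.5 else -1)
            done = s_next < 0 or s_next >= n_states
            r = 1.0 if s_next >= n_states else 0.0  # reward only on right exit
            v_next = 0.0 if done else w[s_next]
            delta = r + gamma * v_next - w[s]       # TD error

            # Classic trace: decay, then mark the state actually visited.
            e *= gamma * lam
            e[s] += 1.0

            # Expected trace: regress z[s] toward the trace observed at s.
            z[s] += beta * (e - z[s])

            # Update values along the expected trace, not the sampled one.
            w += alpha * delta * z[s]
            s = s_next
    return w, z
```

Calling `w, z = train()` returns the learned values and the trace estimates. Because `z[s]` averages over every path that has led to `s` so far, the value update also touches states that were not on the current trajectory, which is the counterfactual credit assignment the abstract describes. Replacing `z[s]` with `e` in the value update recovers ordinary TD(λ) with accumulating traces.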
