Selective Credit Assignment

Veronica Chelu; Diana Borsa; Doina Precup; Hado van Hasselt

arXiv:2202.09699 · cs.LG · February 22, 2022 · 1 citation

TL;DR

This paper presents a unified framework for selective credit assignment in reinforcement learning, introducing new algorithms that improve how credit is distributed backward in time, both on- and off-policy.

Contribution

It offers a unified view on temporal-difference algorithms for selective credit assignment and introduces novel algorithms for backward credit distribution in reinforcement learning.

Findings

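01 Some existing online learning algorithms that assign credit selectively arise as special cases of the proposed framework.

02 New algorithms assign credit backward in time counterfactually, allowing credit to be assigned off-trajectory and off-policy.

03 The paper presents insights into applying weightings to value-based learning and planning algorithms, and describes their role in mediating backward credit distribution in prediction and control.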

Abstract

Efficient credit assignment is essential for reinforcement learning algorithms in both prediction and control settings. We describe a unified view on temporal-difference algorithms for selective credit assignment. These selective algorithms apply weightings to quantify the contribution of learning updates. We present insights into applying weightings to value-based learning and planning algorithms, and describe their role in mediating the backward credit distribution in prediction and control. Within this space, we identify some existing online learning algorithms that can assign credit selectively as special cases, as well as add new algorithms that assign credit backward in time counterfactually, allowing credit to be assigned off-trajectory and off-policy.
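
To make the idea of weighting learning updates concrete, here is a minimal sketch in Python. It is not taken from the paper: it assumes a tabular prediction setting and a hypothetical per-state `weight` array that scales how much each visited state enters the eligibility trace of TD(λ). The function name, the transition tuple layout, and the weight values are all illustrative assumptions.

```python
import numpy as np


def selective_td_lambda(episodes, n_states, weight, alpha=0.1, gamma=0.9, lam=0.8):
    """Tabular TD(lambda) with a weighted eligibility trace (illustrative sketch).

    episodes: iterable of lists of (state, reward, next_state, done) tuples.
    weight:   array of shape (n_states,); hypothetical selectivity weights.
              A weight of 0 means the state never receives credit.
    """
    v = np.zeros(n_states)
    for episode in episodes:
        e = np.zeros(n_states)  # eligibility trace, reset at episode start
        for s, r, s_next, done in episode:
            # One-step TD error; no bootstrapping from a terminal transition.
            target = r + (0.0 if done else gamma * v[s_next])
            delta = target - v[s]
            # Decay the trace, then add the visited state scaled by its weight.
            e *= gamma * lam
            e[s] += weight[s]
            # Every state with a nonzero trace shares in the TD error.
            v += alpha * delta * e
    return v
```

With all weights set to 1 this reduces to standard accumulating-trace TD(λ). Setting a state's weight to 0 excludes it from backward credit distribution, while later TD errors still flow back to earlier states that have nonzero weight.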

Taxonomy

Topics: Auction Theory and Applications · Smart Grid Energy Management · Reinforcement Learning in Robotics