Selective Credit Assignment
Veronica Chelu, Diana Borsa, Doina Precup, Hado van Hasselt

TL;DR
This paper presents a unified framework for selective credit assignment in reinforcement learning, introducing new algorithms that improve how credit is distributed backward in time, both on- and off-policy.
Contribution
It offers a unified view on temporal-difference algorithms for selective credit assignment and introduces novel algorithms for backward credit distribution in reinforcement learning.
Findings
Existing algorithms can be viewed as special cases of the proposed framework.
New algorithms enable credit assignment off-trajectory and off-policy.
The paper offers insights into how weightings can be applied to value-based learning and planning algorithms.
Abstract
Efficient credit assignment is essential for reinforcement learning algorithms in both prediction and control settings. We describe a unified view on temporal-difference algorithms for selective credit assignment. These selective algorithms apply weightings to quantify the contribution of learning updates. We present insights into applying weightings to value-based learning and planning algorithms, and describe their role in mediating the backward credit distribution in prediction and control. Within this space, we identify some existing online learning algorithms that can assign credit selectively as special cases, as well as add new algorithms that assign credit backward in time counterfactually, allowing credit to be assigned off-trajectory and off-policy.
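The idea of weighting how much each update contributes can be illustrated with a small tabular example. The sketch below is not the paper's algorithm: it is a minimal TD(λ) variant, assuming a hypothetical per-state weighting `weight[s]` that scales how much of each visited state enters the eligibility trace. The function name, the toy chain environment, and all hyperparameter values are assumptions made for illustration. With all weights equal to 1, it reduces to ordinary accumulating-trace TD(λ).

```python
import numpy as np

def selective_td_lambda(episodes, n_states, weight, alpha=0.1, gamma=0.9, lam=0.8):
    """Tabular TD(lambda) where a per-state weighting gates the eligibility trace.

    `weight` is a hypothetical selectivity function (an assumption of this sketch):
    weight[s] = 0 means state s never receives credit, weight[s] = 1 means it
    receives credit as in standard accumulating-trace TD(lambda).
    """
    v = np.zeros(n_states)
    for episode in episodes:
        e = np.zeros(n_states)  # eligibility trace, reset at the start of each episode
        for s, r, s_next, done in episode:
            e *= gamma * lam        # decay credit held by previously visited states
            e[s] += weight[s]       # selectively mark the current state as eligible
            bootstrap = 0.0 if done else gamma * v[s_next]
            delta = r + bootstrap - v[s]   # one-step TD error
            v += alpha * delta * e         # distribute the error backward via the trace
    return v

# Toy chain 0 -> 1 -> 2 (terminal), with reward 1 on the final transition.
episode = [(0, 0.0, 1, False), (1, 1.0, 2, True)]

uniform = selective_td_lambda([episode] * 50, 3, weight=np.ones(3))
selective = selective_td_lambda([episode] * 50, 3, weight=np.array([0.0, 1.0, 1.0]))

print("uniform weighting:  ", uniform)
print("selective weighting:", selective)
```

In this toy run, setting the weight of state 0 to zero means its value estimate is never updated, while state 1 still learns from the reward. This is only one simple way a weighting can mediate backward credit distribution; the counterfactual, off-trajectory, and off-policy variants described in the abstract are not covered by this sketch.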
Taxonomy
Topics: Auction Theory and Applications · Smart Grid Energy Management · Reinforcement Learning in Robotics
