Variance Reduced Advantage Estimation with $\delta$ Hindsight Credit Assignment

Kenny Young

arXiv:1911.08362·cs.LG·September 29, 2020·1 cites

TL;DR

This paper introduces a variance-reduced Hindsight Credit Assignment algorithm for reinforcement learning, providing theoretical guarantees of lower variance and demonstrating its potential for more efficient credit assignment.

Contribution

The paper proposes a new HCA algorithm whose variance is provably lower than that of the conventional Monte-Carlo estimator when the functions it relies on can be estimated exactly, which gives a theoretical basis for the usefulness of HCA.

Findings

01 The proposed method has lower variance than the conventional Monte-Carlo estimator.

02 The variance reduction is proved for the case where the necessary functions can be estimated exactly.

03 Empirical demonstrations of improved credit assignment efficiency.

Abstract

Hindsight Credit Assignment (HCA) refers to a recently proposed family of methods for producing more efficient credit assignment in reinforcement learning. These methods work by explicitly estimating the probability that certain actions were taken in the past given present information. Prior work has studied the properties of such methods and demonstrated their behaviour empirically. We extend this work by introducing a particular HCA algorithm which has provably lower variance than the conventional Monte-Carlo estimator when the necessary functions can be estimated exactly. This result provides a strong theoretical basis for how HCA could be broadly useful.
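
To make the phrase "estimating the probability that certain actions were taken in the past given present information" concrete, here is a minimal tabular sketch in Python. It is not the algorithm introduced in the paper: it simply counts, over sampled episodes, which action was taken at a state in those cases where a given later state was eventually reached. The function name `hindsight_action_probs`, the trajectory format, and the toy episodes are all assumptions made for illustration.

```python
from collections import Counter


def hindsight_action_probs(trajectories, state, future_state):
    """Empirical estimate of P(action taken at `state` | `future_state` visited later).

    `trajectories` is a list of episodes, each a list of (state, action) pairs.
    This is a hypothetical counting estimator, used only to illustrate the idea
    of a hindsight action distribution.
    """
    counts = Counter()
    for traj in trajectories:
        for t, (s, a) in enumerate(traj):
            if s != state:
                continue
            # Count this action only if the episode later passes through future_state.
            if any(s2 == future_state for s2, _ in traj[t + 1:]):
                counts[a] += 1
    total = sum(counts.values())
    return {a: c / total for a, c in counts.items()} if total else {}


# Toy episodes (invented): terminal states carry a placeholder action of None.
trajectories = [
    [("s0", "left"), ("s1", "left"), ("goal", None)],
    [("s0", "right"), ("s2", "left"), ("pit", None)],
    [("s0", "left"), ("s1", "right"), ("goal", None)],
    [("s0", "right"), ("s2", "right"), ("goal", None)],
]

probs = hindsight_action_probs(trajectories, "s0", "goal")
print(probs)
```

In this toy data, three episodes reach `goal` from `s0`, two of them after `left`, so the hindsight estimate puts more weight on `left` than the 50/50 split seen across all four episodes. In practice such probabilities would have to be estimated from data, and the variance result stated in the abstract applies only when the necessary functions can be estimated exactly.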

Taxonomy

Topics: Financial Distress and Bankruptcy Prediction · Reinforcement Learning in Robotics · Risk and Portfolio Optimization