Bootstrapped Representations in Reinforcement Learning
Charline Le Lan, Stephen Tu, Mark Rowland, Anna Harutyunyan, Rishabh Agarwal, Marc G. Bellemare, Will Dabney

TL;DR
This paper provides a theoretical analysis of state representations learned by bootstrapping methods in reinforcement learning, compares them with other approaches, and introduces new auxiliary learning rules validated on classic domains.
Contribution
It offers the first theoretical characterization of representations learned by temporal difference learning and designs new auxiliary rules based on this analysis.
Findings
TD learning captures different features than Monte Carlo methods.
Auxiliary rules improve policy evaluation performance.
Empirical results on classic domains validate the proposed methods.
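To make the contrast between TD learning and Monte Carlo methods concrete, the following is a minimal tabular sketch of the two update rules for policy evaluation. It is not the paper's method or experimental setup: the five-state random-walk domain, the function names (sample_episode, td0, monte_carlo), and the step size are all assumptions chosen for illustration. TD(0) bootstraps from its own estimate of the next state's value, whereas Monte Carlo regresses toward the full observed return.

```python
import random

def sample_episode(n_states, rng):
    """Hypothetical toy domain: a random walk over states 0..n_states-1.

    The walk starts in the middle and ends when it steps off either end.
    Exiting on the right gives reward 1, everything else gives reward 0.
    Returns a list of (state, reward, next_state) with next_state=None
    on the terminating transition.
    """
    s = n_states // 2
    episode = []
    while True:
        s_next = s + rng.choice((-1, 1))
        if s_next < 0:
            episode.append((s, 0.0, None))
            return episode
        if s_next >= n_states:
            episode.append((s, 1.0, None))
            return episode
        episode.append((s, 0.0, s_next))
        s = s_next

def td0(n_states, episodes, alpha=0.01, gamma=1.0, seed=0):
    """TD(0): move v[s] toward the bootstrapped target r + gamma * v[s']."""
    rng = random.Random(seed)
    v = [0.0] * n_states
    for _ in range(episodes):
        for s, r, s_next in sample_episode(n_states, rng):
            bootstrap = 0.0 if s_next is None else v[s_next]
            v[s] += alpha * (r + gamma * bootstrap - v[s])
    return v

def monte_carlo(n_states, episodes, alpha=0.01, gamma=1.0, seed=0):
    """Every-visit Monte Carlo: move v[s] toward the observed return."""
    rng = random.Random(seed)
    v = [0.0] * n_states
    for _ in range(episodes):
        g = 0.0
        # Walk the episode backwards so g accumulates the discounted return.
        for s, r, _ in reversed(sample_episode(n_states, rng)):
            g = r + gamma * g
            v[s] += alpha * (g - v[s])
    return v

if __name__ == "__main__":
    print("TD(0):", [round(x, 2) for x in td0(5, 20000)])
    print("MC:   ", [round(x, 2) for x in monte_carlo(5, 20000)])
```

In this toy domain the true value of state i is (i + 1) / 6, and both rules should approach it. The sketch only illustrates how the two targets differ; it says nothing about the learned representations that the paper analyzes.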
Abstract
In reinforcement learning (RL), state representations are key to dealing with large or continuous state spaces. While one of the promises of deep learning algorithms is to automatically construct features well-tuned for the task they try to solve, such a representation might not emerge from end-to-end training of deep RL agents. To mitigate this issue, auxiliary objectives are often incorporated into the learning process and help shape the learnt state representation. Bootstrapping methods are today's method of choice to make these additional predictions. Yet, it is unclear which features these algorithms capture and how they relate to those from other auxiliary-task-based approaches. In this paper, we address this gap and provide a theoretical characterization of the state representation learnt by temporal difference learning (Sutton, 1988). Surprisingly, we find that this…
Taxonomy
Topics: Reinforcement Learning in Robotics
