A Unifying Framework for Action-Conditional Self-Predictive Reinforcement Learning
Khimya Khetarpal, Zhaohan Daniel Guo, Bernardo Avila Pires, Yunhao Tang, Clare Lyle, Mark Rowland, Nicolas Heess, Diana Borsa, Arthur Guez, Will Dabney

TL;DR
This paper develops a unified theoretical framework for action-conditional self-predictive reinforcement learning, bridging the gap between existing theory and practical algorithms, and demonstrating improved empirical performance across various settings.
Contribution
It introduces a new action-conditional objective (BYOL-AC), analyzes its convergence, and unifies different objectives through model-based and model-free perspectives.
Findings
BYOL-AC outperforms existing methods in diverse RL environments.
Theoretical analysis reveals convergence properties and relationships between objectives.
Proposes a variance-like objective (BYOL-VAR) with favorable properties.
Abstract
Learning a good representation is a crucial challenge for Reinforcement Learning (RL) agents. Self-predictive learning provides means to jointly learn a latent representation and dynamics model by bootstrapping from future latent representations (BYOL). Recent work has developed theoretical insights into these algorithms by studying a continuous-time ODE model for self-predictive representation learning under the simplifying assumption that the algorithm depends on a fixed policy (BYOL-Π); this assumption is at odds with practical instantiations of such algorithms, which explicitly condition their predictions on future actions. In this work, we take a step towards bridging the gap between theory and practice by analyzing an action-conditional self-predictive objective (BYOL-AC) using the ODE framework, characterizing its convergence properties and highlighting important distinctions…
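To make the distinction in the abstract concrete, here is a minimal sketch of a self-predictive loss with one linear latent predictor per action, next to a variant that shares a single predictor across actions. It is an illustration only and not the authors' implementation: the tabular setup, the linear predictors, the two-state toy environment, and the names `predict` and `self_predictive_loss` are all assumptions made for this example.

```python
def predict(latent, predictor):
    """Apply a linear latent-dynamics map: returns latent @ predictor."""
    k_out = len(predictor[0])
    return [sum(latent[i] * predictor[i][j] for i in range(len(latent)))
            for j in range(k_out)]


def self_predictive_loss(phi, predictors, transitions):
    """Mean squared error between predicted and bootstrapped next latents.

    phi:         list of latent vectors, one per state (the representation)
    predictors:  dict mapping action -> k x k matrix (hypothetical layout)
    transitions: list of (state, action, next_state) samples
    """
    total = 0.0
    for s, a, s_next in transitions:
        pred = predict(phi[s], predictors[a])
        # Bootstrap target: the latent of the next state. In training it would
        # be held fixed (no gradient flows through it).
        target = phi[s_next]
        total += sum((p - t) ** 2 for p, t in zip(pred, target))
    return total / len(transitions)


# Toy environment (assumed): 2 states, action 0 stays put, action 1 swaps states.
transitions = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
phi = [[1, 0], [0, 1]]  # one-hot latents, chosen for readability

# Action-conditional: a separate predictor per action can match each transition.
per_action = {0: [[1, 0], [0, 1]], 1: [[0, 1], [1, 0]]}
ac_loss = self_predictive_loss(phi, per_action, transitions)

# Shared predictor: one matrix must average over both actions.
shared = [[0.5, 0.5], [0.5, 0.5]]
shared_loss = self_predictive_loss(phi, {0: shared, 1: shared}, transitions)

print(ac_loss, shared_loss)  # 0.0 0.5
```

In this toy case the per-action predictors fit every transition exactly, while the shared predictor can only predict the action-averaged next latent and is left with a residual error. The sketch evaluates the loss only; it says nothing about the learning dynamics or convergence properties that the paper analyzes.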
Taxonomy
Topics: Evolutionary Algorithms and Applications · Reinforcement Learning in Robotics
