Unifying Goal-Conditioned RL and Unsupervised Skill Learning via Control-Maximization
Alireza Modirshanechi, Benjamin Eysenbach, Peter Dayan, Eric Schulz

TL;DR
This paper unifies goal-conditioned reinforcement learning and unsupervised skill learning through control maximization, providing a theoretical foundation and practical insights for RL pretraining.
Contribution
The paper shows that the canonical formulations of goal-conditioned RL (GCRL) differ fundamentally, links mutual information skill learning (MISL) objectives to goal sensitivity, and offers guidance on choosing pretraining objectives to match downstream tasks.
Findings
Three canonical GCRL formulations are fundamentally inequivalent.
MISL objectives are bounded by formulation-specific goal sensitivities.
More diverse skills lead to greater downstream goal sensitivity.
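As a rough illustration of the kind of quantity that mutual information skill learning (MISL) objectives build on, the sketch below computes the mutual information I(Z; S) between a discrete skill Z and a discrete state S. This is a toy, tabular example under assumed distributions, not the paper's formulation; the function name `mutual_information` and the two-skill, two-state setup are hypothetical choices made here for illustration.

```python
import math

def mutual_information(p_z, p_s_given_z):
    """Return I(Z; S) in nats for a discrete skill Z and discrete state S.

    p_z[z] is the skill prior; p_s_given_z[z][s] is the (assumed) probability
    that skill z ends in state s. Both are toy inputs, not learned quantities.
    """
    n_skills = len(p_z)
    n_states = len(p_s_given_z[0])
    # Marginal state distribution: p(s) = sum_z p(z) p(s | z)
    p_s = [
        sum(p_z[z] * p_s_given_z[z][s] for z in range(n_skills))
        for s in range(n_states)
    ]
    mi = 0.0
    for z in range(n_skills):
        for s in range(n_states):
            joint = p_z[z] * p_s_given_z[z][s]
            if joint > 0:  # terms with zero probability contribute nothing
                mi += joint * math.log(p_s_given_z[z][s] / p_s[s])
    return mi

uniform_prior = [0.5, 0.5]
# Each skill reliably reaches its own state: skills are fully distinguishable.
distinct_skills = [[1.0, 0.0], [0.0, 1.0]]
# Both skills induce the same state distribution: skills are indistinguishable.
identical_skills = [[0.5, 0.5], [0.5, 0.5]]

mi_distinct = mutual_information(uniform_prior, distinct_skills)
mi_identical = mutual_information(uniform_prior, identical_skills)
```

In this toy setup, skills that reach different states attain the maximum value log 2, while skills that behave identically give zero, which is the sense in which such objectives reward behavioral diversity.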
Abstract
Unsupervised pretraining has driven empirical advances in goal-conditioned reinforcement learning (GCRL), but its theoretical foundations remain poorly understood. In particular, an influential class of methods, mutual information skill learning (MISL), discovers behaviorally diverse skills that can later be used for downstream goal-reaching. However, it remains a theoretical mystery why skills learned through MISL should support goal-reaching. A subtle challenge is that both GCRL and MISL are umbrella terms: different GCRL tasks use distinct criteria for measuring goal-reaching performance, while different MISL methods optimize distinct notions of behavioral diversity. We address this challenge and unify GCRL and MISL as instances of control maximization. We identify three canonical GCRL formulations and prove that they are fundamentally inequivalent: they can induce incompatible…
