Decoupling Exploration and Exploitation for Unsupervised Pre-training with Successor Features
JaeYoon Kim, Junyu Xuan, Christy Liang, Farookh Hussain

TL;DR
This paper introduces NMPS, a novel unsupervised pre-training approach that decouples exploration and exploitation using successor features, leading to better adaptation and performance in reinforcement learning tasks.
Contribution
The paper proposes a non-monolithic exploration method for successor features, improving upon existing monolithic approaches in unsupervised pre-training.
Findings
NMPS outperforms APS in experiments.
Decoupling exploration and exploitation enhances adaptation.
Successor features facilitate task-agnostic exploration.
Abstract
Unsupervised pre-training has been on the lookout for the virtue of a value function representation referred to as successor features (SFs), which decouples the dynamics of the environment from the rewards. It has a significant impact on the process of task-specific fine-tuning due to the decomposition. However, existing approaches struggle with local optima due to the unified intrinsic reward of exploration and exploitation without considering the linear regression problem and the discriminator supporting a small skill space. We propose a novel unsupervised pre-training model with SFs based on a non-monolithic exploration methodology. Our approach pursues the decomposition of exploitation and exploration of an agent built on SFs, which requires separate agents for the respective purpose. The idea will leverage not only the inherent characteristics of SFs such as a quick adaptation to…
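The decoupling of dynamics from rewards that the abstract attributes to SFs can be illustrated with a small sketch. This is not the paper's NMPS method. It assumes the standard SF setting in which the reward is linear in some features, r(s) = phi(s) . w, so that the value under a fixed policy is V(s) = psi(s) . w, where psi(s) is the expected discounted sum of future features. The 3-state chain, the one-hot features, the discount factor, and all names below are hypothetical choices made only for illustration.

```python
GAMMA = 0.5

# Hypothetical 3-state chain under a fixed policy: 0 -> 1 -> 2 -> 2 (absorbing).
NEXT_STATE = [1, 2, 2]
# One-hot state features phi(s); an illustrative choice, not from the paper.
PHI = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def successor_features(next_state, phi, gamma, sweeps=200):
    """Iterate psi(s) = phi(s) + gamma * psi(next(s)) towards its fixed point."""
    n, d = len(phi), len(phi[0])
    psi = [[0.0] * d for _ in range(n)]
    for _ in range(sweeps):
        psi = [
            [phi[s][k] + gamma * psi[next_state[s]][k] for k in range(d)]
            for s in range(n)
        ]
    return psi


def value(psi_s, w):
    """V(s) = psi(s) . w, valid when the reward is linear: r(s) = phi(s) . w."""
    return sum(p * wk for p, wk in zip(psi_s, w))


# psi depends only on the dynamics and the policy, not on any reward.
PSI = successor_features(NEXT_STATE, PHI, GAMMA)

# Two different tasks reuse the same PSI; only the weight vector w changes.
w_task_a = [0.0, 0.0, 1.0]  # reward only in state 2
w_task_b = [1.0, 0.0, 0.0]  # reward only in state 0
values_a = [value(PSI[s], w_task_a) for s in range(3)]
values_b = [value(PSI[s], w_task_b) for s in range(3)]
```

In this sketch, switching tasks only swaps the weight vector while PSI is computed once, which is the property that makes SFs attractive for task-specific fine-tuning. In practice w is typically fitted by regressing observed rewards on features, which is presumably the linear regression problem the abstract refers to.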
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and Data Classification
Methods: Linear Regression
