Augmenting Unsupervised Reinforcement Learning with Self-Reference
Andrew Zhao, Erle Zhu, Rui Lu, Matthieu Lin, Yong-Jin Liu, Gao Huang

TL;DR
This paper introduces the Self-Reference (SR) module that leverages past experiences to improve unsupervised reinforcement learning, achieving state-of-the-art results and increased sample efficiency in benchmark tasks.
Contribution
The paper proposes the Self-Reference (SR) add-on module that explicitly uses historical data to enhance performance and sample efficiency in unsupervised reinforcement learning.
Findings
Achieved an IQM of 86% and an Optimality Gap of 16% on the benchmark.
Improved existing algorithms by up to 17% IQM.
Reduced the Optimality Gap by 31%.
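For readers unfamiliar with the two metrics, the following is a minimal sketch of how IQM and Optimality Gap are commonly computed from per-run normalized scores. It is not the authors' evaluation code. It assumes scores are normalized so that 1.0 corresponds to the target (expert) performance, and the function names `iqm` and `optimality_gap` and the threshold parameter `gamma` are illustrative choices.

```python
def iqm(scores):
    """Interquartile mean: average of the middle 50% of the scores."""
    x = sorted(float(s) for s in scores)
    if not x:
        raise ValueError("scores must be non-empty")
    cut = int(0.25 * len(x))          # number of runs dropped at each end
    middle = x[cut:len(x) - cut]      # discard bottom 25% and top 25%
    return sum(middle) / len(middle)


def optimality_gap(scores, gamma=1.0):
    """Mean shortfall below the target score gamma (lower is better).

    Runs that reach or exceed gamma contribute zero, so scores above the
    target cannot compensate for runs that fall short of it.
    """
    x = [float(s) for s in scores]
    if not x:
        raise ValueError("scores must be non-empty")
    return sum(max(gamma - s, 0.0) for s in x) / len(x)


# Hypothetical normalized scores for eight runs (made-up numbers).
example_scores = [0.2, 0.4, 0.6, 0.8, 0.9, 1.0, 1.1, 1.2]
example_iqm = iqm(example_scores)
example_gap = optimality_gap(example_scores)
```

IQM is less sensitive to outlier runs than a plain mean, while the Optimality Gap reports how far, on average, runs fall short of the target; a higher IQM and a lower Optimality Gap both indicate better performance.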
Abstract
Humans possess the ability to draw on past experiences explicitly when learning new tasks and applying them accordingly. We believe this capacity for self-referencing is especially advantageous for reinforcement learning agents in the unsupervised pretrain-then-finetune setting. During pretraining, an agent's past experiences can be explicitly utilized to mitigate the nonstationarity of intrinsic rewards. In the finetuning phase, referencing historical trajectories prevents the unlearning of valuable exploratory behaviors. Motivated by these benefits, we propose the Self-Reference (SR) approach, an add-on module explicitly designed to leverage historical information and enhance agent performance within the pretrain-finetune paradigm. Our approach achieves state-of-the-art results in terms of Interquartile Mean (IQM) performance and Optimality Gap reduction on the Unsupervised…
Taxonomy
Topics: Reinforcement Learning in Robotics
