Augmenting Unsupervised Reinforcement Learning with Self-Reference

Andrew Zhao; Erle Zhu; Rui Lu; Matthieu Lin; Yong-Jin Liu; Gao Huang

arXiv:2311.09692 · cs.LG · November 17, 2023 · 1 citation

Open Access

TL;DR

This paper introduces the Self-Reference (SR) module that leverages past experiences to improve unsupervised reinforcement learning, achieving state-of-the-art results and increased sample efficiency in benchmark tasks.

Contribution

The paper proposes the Self-Reference (SR) add-on module that explicitly uses historical data to enhance performance and sample efficiency in unsupervised reinforcement learning.

Findings

01. Achieved 86% IQM and 16% Optimality Gap on the benchmark.

02. Improved existing algorithms by up to 17% IQM.

03. Reduced the Optimality Gap by 31%.
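
For readers unfamiliar with the two metrics reported in these findings, here is a minimal sketch of how IQM and Optimality Gap are commonly computed from normalized scores. It assumes the usual definitions: IQM as the mean of the middle 50% of runs, and Optimality Gap as the average shortfall below a target score of 1.0. The function names, the target threshold, and the example scores are illustrative assumptions and are not taken from the paper.

```python
from typing import Sequence


def interquartile_mean(scores: Sequence[float]) -> float:
    """Mean of the middle 50% of scores (lowest and highest 25% discarded)."""
    ordered = sorted(scores)
    cut = int(0.25 * len(ordered))  # number of runs trimmed from each end
    middle = ordered[cut:len(ordered) - cut]
    return sum(middle) / len(middle)


def optimality_gap(scores: Sequence[float], target: float = 1.0) -> float:
    """Average amount by which scores fall short of `target` (0 if at or above)."""
    return sum(max(target - s, 0.0) for s in scores) / len(scores)


# Hypothetical normalized scores for illustration only (not results from the paper).
example_scores = [0.2, 0.5, 0.8, 0.9, 1.0, 1.1, 0.7, 0.6]

iqm = interquartile_mean(example_scores)
gap = optimality_gap(example_scores)
print(f"IQM: {iqm:.3f}, Optimality Gap: {gap:.3f}")
```

Under these definitions a higher IQM is better and a lower Optimality Gap is better, which is why the findings report an IQM gain alongside an Optimality Gap reduction.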

Abstract

Humans possess the ability to draw on past experiences explicitly when learning new tasks and applying them accordingly. We believe this capacity for self-referencing is especially advantageous for reinforcement learning agents in the unsupervised pretrain-then-finetune setting. During pretraining, an agent's past experiences can be explicitly utilized to mitigate the nonstationarity of intrinsic rewards. In the finetuning phase, referencing historical trajectories prevents the unlearning of valuable exploratory behaviors. Motivated by these benefits, we propose the Self-Reference (SR) approach, an add-on module explicitly designed to leverage historical information and enhance agent performance within the pretrain-finetune paradigm. Our approach achieves state-of-the-art results in terms of Interquartile Mean (IQM) performance and Optimality Gap reduction on the Unsupervised…

Taxonomy

Topics: Reinforcement Learning in Robotics