Value-Consistent Representation Learning for Data-Efficient Reinforcement Learning

Yang Yue; Bingyi Kang; Zhongwen Xu; Gao Huang; Shuicheng Yan

arXiv:2206.12542 · cs.LG · August 17, 2022 · 1 citation

TL;DR

This paper introduces VCR, a novel representation learning method that aligns imagined future states with real states through value prediction, significantly enhancing data efficiency in reinforcement learning.

Contribution

VCR directly optimizes state representations for decision-making by aligning value predictions of imagined and real states, a novel approach compared to traditional contrastive methods.
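
The alignment idea can be illustrated with a small toy sketch. It is not the authors' implementation: the linear encoder, transition model, and value head, all function and parameter names, and the use of a mean squared error as the distance are assumptions made here for illustration. The sketch rolls a latent state forward through the given actions and compares the action values predicted from each imagined state with those predicted from the encoding of the real observation at the same step.

```python
# Hypothetical toy sketch of a value-consistency loss.
# All names, the linear models, and the MSE distance are illustrative assumptions.

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def matvec(m, x):
    return [dot(row, x) for row in m]

def encode(obs, enc_w):
    """Assumed linear encoder: observation -> latent state."""
    return matvec(enc_w, obs)

def imagine(latent, action, trans_w):
    """Assumed linear transition model: (latent, action) -> next latent."""
    return matvec(trans_w, latent + [action])

def q_values(latent, q_w):
    """Assumed linear value head: one Q-value per discrete action."""
    return matvec(q_w, latent)

def value_consistency_loss(obs, actions, future_obs, enc_w, trans_w, q_w):
    """Mean squared gap between Q-values of imagined and real future states."""
    latent = encode(obs, enc_w)
    total, count = 0.0, 0
    for action, real_obs in zip(actions, future_obs):
        latent = imagine(latent, action, trans_w)        # imagined next state
        q_imag = q_values(latent, q_w)
        q_real = q_values(encode(real_obs, enc_w), q_w)  # value of the real state
        total += sum((a - b) ** 2 for a, b in zip(q_imag, q_real))
        count += len(q_imag)
    return total / count

# Toy setup: identity encoder and value head; the model adds the action to dim 0.
ENC_W = [[1.0, 0.0], [0.0, 1.0]]
TRANS_W = [[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
Q_W = [[1.0, 0.0], [0.0, 1.0]]

loss = value_consistency_loss(
    [0.0, 0.0], [1.0, 1.0], [[1.0, 0.0], [3.0, 0.0]], ENC_W, TRANS_W, Q_W
)
print(loss)
```

In an actual training loop the loss would be minimized with respect to the encoder and transition model parameters, with the real-state values typically treated as a fixed target; that detail is likewise an assumption of this sketch rather than something stated above.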

Findings

01. Achieves state-of-the-art results on Atari 100K benchmarks.

02. Improves sample efficiency in DeepMind Control Suite tasks.

03. Effective for both discrete and continuous action spaces.

Abstract

Deep reinforcement learning (RL) algorithms suffer severe performance degradation when interaction data is scarce, which limits their real-world application. Recently, visual representation learning has been shown to be effective and promising for boosting sample efficiency in RL. These methods usually rely on contrastive learning and data augmentation to train a transition model for state prediction, which differs from how the model is used in RL: performing value-based planning. Accordingly, the representation learned by these visual methods may be good for recognition but not optimal for estimating state value and solving the decision problem. To address this issue, we propose a novel method, called value-consistent representation learning (VCR), to learn representations that are directly related to decision-making. More specifically, VCR trains a model to predict the future…

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: ALIGN · Contrastive Learning