RePreM: Representation Pre-training with Masked Model for Reinforcement Learning

Yuanying Cai; Chuheng Zhang; Wei Shen; Xuyun Zhang; Wenjie Ruan; Longbo Huang

arXiv:2303.01668·cs.LG·March 6, 2023·1 cites

TL;DR

RePreM pre-trains representations for reinforcement learning with masked sequence modeling over trajectories. The resulting representations capture long-term dynamics and transfer across tasks, without algorithmic machinery such as data augmentation or estimating multiple models.

Contribution

It presents a simple, effective pre-training method using masked modeling in RL that captures long-term dynamics and scales well with data and model size.

Findings

01. RePreM improves the accuracy of dynamic prediction.

02. It improves transfer learning performance.

03. It enables sample-efficient RL with both value-based and actor-critic methods.

Abstract

Inspired by the recent success of sequence modeling in RL and the use of masked language models for pre-training, we propose a masked model for pre-training in RL, RePreM (Representation Pre-training with Masked Model), which trains an encoder combined with transformer blocks to predict the masked states or actions in a trajectory. RePreM is simple but effective compared to existing representation pre-training methods in RL. By relying on sequence modeling, it avoids algorithmic sophistication (such as data augmentation or estimating multiple models) and generates a representation that captures long-term dynamics well. Empirically, we demonstrate the effectiveness of RePreM in various tasks, including dynamic prediction, transfer learning, and sample-efficient RL with both value-based and actor-critic methods. Moreover, we show that RePreM scales well with dataset size, dataset quality, and the…
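
The masking step the abstract describes can be illustrated with a minimal sketch. This is not the authors' implementation: the interleaved state/action token layout, the `MASK` placeholder, the `mask_prob` value, and the helper names `interleave` and `mask_trajectory` are all assumptions made for illustration. It only shows how a trajectory might be turned into a masked input plus prediction targets; the encoder and transformer blocks are omitted.

```python
import random

MASK = "<mask>"  # hypothetical placeholder for a hidden state or action


def interleave(states, actions):
    """Flatten a trajectory into tagged tokens: s0, a0, s1, a1, ..."""
    tokens = []
    for state, action in zip(states, actions):
        tokens.append(("state", state))
        tokens.append(("action", action))
    return tokens


def mask_trajectory(tokens, mask_prob=0.5, seed=0):
    """Hide each token independently with probability mask_prob.

    Returns the masked sequence and a dict mapping each masked position
    to the original token, i.e. what a model would be asked to predict.
    The masking rate and scheme here are assumptions, not the paper's.
    """
    rng = random.Random(seed)
    masked, targets = [], {}
    for i, (kind, value) in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append((kind, MASK))   # keep the token type, hide the value
            targets[i] = (kind, value)
        else:
            masked.append((kind, value))
    return masked, targets


# Toy trajectory: three 2-d states and three discrete actions.
states = [[0.0, 1.0], [0.1, 0.9], [0.2, 0.8]]
actions = [1, 0, 1]
tokens = interleave(states, actions)
masked, targets = mask_trajectory(tokens, mask_prob=0.5, seed=0)
```

A pre-training loss would then compare a model's predictions at the positions in `targets` against the stored original tokens.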

Taxonomy

Topics: Reinforcement Learning in Robotics · Software Engineering Research · Topic Modeling