Predictive Information Accelerates Learning in RL

Kuang-Huei Lee; Ian Fischer; Anthony Liu; Yijie Guo; Honglak Lee; John Canny; Sergio Guadarrama

arXiv:2007.12401 · cs.LG · October 27, 2020 · 24 citations

TL;DR

This paper introduces PI-SAC, an RL method that uses predictive information as an auxiliary task to improve sample efficiency in pixel-based continuous control tasks.

Contribution

It proposes a novel approach that incorporates predictive information via a contrastive CEB objective into SAC, enhancing learning efficiency in RL from pixel inputs.
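
For readers who want the objective spelled out, the block below is a sketch written from the standard CEB formulation, not quoted from this page. The symbols are assumptions introduced here: X is the past, Y is the future, Z is the learned representation, e(z|x) is a forward encoder, b(z|y) is a backward encoder, β weights the compression term, and K is the number of samples in a contrastive batch.

```latex
% Predictive information: mutual information between past X and future Y
I(X; Y) = \mathbb{E}_{p(x,y)}\!\left[ \log \frac{p(x,y)}{p(x)\,p(y)} \right]

% CEB objective: keep what Z says about the future, compress the rest
\mathrm{CEB} \equiv \min_{Z} \; \beta\, I(X; Z \mid Y) - I(Y; Z)

% Variational upper bound on the residual (compression) term
I(X; Z \mid Y) \le \mathbb{E}\!\left[ \log e(z \mid x) - \log b(z \mid y) \right]

% Contrastive (InfoNCE-style) lower bound on the predictive term
I(Y; Z) \ge \mathbb{E}\!\left[ \log \frac{b(z \mid y)}{\tfrac{1}{K} \sum_{k=1}^{K} b(z \mid y_k)} \right]
```

Under these assumptions, substituting the two bounds into the CEB objective gives a tractable loss that can be minimized alongside the SAC losses.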

Findings

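01. PI-SAC substantially improves sample efficiency over challenging baselines on DM Control tasks.

02. Using predictive information as an auxiliary task improves RL performance from pixels on DM Control tasks.

03. The contrastive CEB objective learns a compressed representation of the predictive information of the environment dynamics.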

Abstract

The Predictive Information is the mutual information between the past and the future, I(X_past; X_future). We hypothesize that capturing the predictive information is useful in RL, since the ability to model what will happen next is necessary for success on many tasks. To test our hypothesis, we train Soft Actor-Critic (SAC) agents from pixels with an auxiliary task that learns a compressed representation of the predictive information of the RL environment dynamics using a contrastive version of the Conditional Entropy Bottleneck (CEB) objective. We refer to these as Predictive Information SAC (PI-SAC) agents. We show that PI-SAC agents can substantially improve sample efficiency over challenging baselines on tasks from the DM Control suite of continuous control environments. We evaluate PI-SAC agents by comparing against uncompressed PI-SAC agents, other compressed and uncompressed…
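
To make the auxiliary task concrete, here is a minimal NumPy sketch of a contrastive CEB-style loss over a batch of (past, future) pairs. It is an illustration under stated assumptions, not the authors' implementation: the function names, the unit-variance Gaussian encoders, and the use of in-batch futures as negatives are all choices made here for brevity. The inputs `mu_e` and `mu_b` stand for hypothetical encoder outputs (means) computed from the past and the future respectively.

```python
import numpy as np


def log_normal_unit_var(z, mean):
    """Log density of N(mean, I) evaluated at z, summed over the last axis."""
    d = z.shape[-1]
    return -0.5 * np.sum((z - mean) ** 2, axis=-1) - 0.5 * d * np.log(2 * np.pi)


def logsumexp(a, axis):
    """Numerically stable log-sum-exp along one axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def contrastive_ceb_loss(mu_e, mu_b, beta, rng):
    """Sketch of a contrastive CEB-style loss (assumed form, see lead-in).

    mu_e: (K, d) means of a hypothetical forward encoder e(z|x) on the past.
    mu_b: (K, d) means of a hypothetical backward encoder b(z|y) on the future.
    Returns (loss, residual, infonce).
    """
    K, d = mu_e.shape
    z = mu_e + rng.standard_normal((K, d))  # sample z_i ~ e(z | x_i)
    log_e = log_normal_unit_var(z, mu_e)  # log e(z_i | x_i), shape (K,)
    # log b(z_i | y_j) for every pair (i, j), shape (K, K)
    log_b_all = log_normal_unit_var(z[:, None, :], mu_b[None, :, :])
    log_b = np.diag(log_b_all)  # matched pairs log b(z_i | y_i)
    # Upper bound on I(X; Z | Y): information in Z not explained by the future
    residual = np.mean(log_e - log_b)
    # InfoNCE lower bound on I(Y; Z), using the other futures as negatives
    infonce = np.mean(log_b - (logsumexp(log_b_all, axis=1) - np.log(K)))
    return beta * residual - infonce, residual, infonce


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mu_past = rng.standard_normal((16, 8))
    mu_future = mu_past + 0.1 * rng.standard_normal((16, 8))
    loss, residual, infonce = contrastive_ceb_loss(mu_past, mu_future, 0.1, rng)
    print(loss, residual, infonce)
```

In this sketch, a larger `beta` penalizes information in the representation that the future does not account for, which is the compression the abstract refers to. The contrastive term is capped at log K by construction, so the batch size limits how much predictive information it can certify.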

Code & Models

Repositories

google-research/pisac (TensorFlow, official)

Videos

Predictive Information Accelerates Learning in RL · SlidesLive

Taxonomy

Topics: Fuzzy Logic and Control Systems · Reinforcement Learning in Robotics · Intelligent Tutoring Systems and Adaptive Learning

Methods: Dilated Convolution · Global Average Pooling · Average Pooling · Convolution · 1x1 Convolution · Switchable Atrous Convolution