Predictive Information Accelerates Learning in RL
Kuang-Huei Lee, Ian Fischer, Anthony Liu, Yijie Guo, Honglak Lee, John Canny, Sergio Guadarrama

TL;DR
This paper introduces PI-SAC, an RL method that uses predictive information as an auxiliary task to improve sample efficiency in pixel-based continuous control tasks.
Contribution
It adds a contrastive Conditional Entropy Bottleneck (CEB) auxiliary objective to Soft Actor-Critic (SAC), so that the agent learns a compressed representation of predictive information and learns more efficiently from pixel inputs.
Findings
PI-SAC significantly outperforms baselines in sample efficiency.
Using predictive information improves RL performance on DM Control tasks.
Contrastive CEB effectively compresses environment dynamics information.
Abstract
The Predictive Information is the mutual information between the past and the future, I(X_past; X_future). We hypothesize that capturing the predictive information is useful in RL, since the ability to model what will happen next is necessary for success on many tasks. To test our hypothesis, we train Soft Actor-Critic (SAC) agents from pixels with an auxiliary task that learns a compressed representation of the predictive information of the RL environment dynamics using a contrastive version of the Conditional Entropy Bottleneck (CEB) objective. We refer to these as Predictive Information SAC (PI-SAC) agents. We show that PI-SAC agents can substantially improve sample efficiency over challenging baselines on tasks from the DM Control suite of continuous control environments. We evaluate PI-SAC agents by comparing against uncompressed PI-SAC agents, other compressed and uncompressed…
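To make the idea of a contrastive estimate of I(X_past; X_future) concrete, here is a minimal sketch of the generic InfoNCE lower bound, computed from paired embeddings of past and future observations. It is not the authors' implementation: the function name `info_nce_lower_bound`, the dot-product score, and the toy inputs are assumptions made for illustration, and the sketch leaves out the compression term that distinguishes CEB from a plain contrastive bound.

```python
import numpy as np


def info_nce_lower_bound(z_past, z_future):
    """InfoNCE lower bound (in nats) on I(past; future).

    z_past, z_future: arrays of shape (K, d). Row i of each array is assumed
    to come from the same trajectory (a positive pair); all other rows in the
    batch act as negatives. Hypothetical helper, for illustration only.
    """
    z_past = np.asarray(z_past, dtype=float)
    z_future = np.asarray(z_future, dtype=float)
    # (K, K) score matrix; the diagonal holds the positive-pair scores.
    scores = z_past @ z_future.T
    # Subtract the row maximum for numerical stability before exponentiating.
    scores = scores - scores.max(axis=1, keepdims=True)
    log_softmax = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    k = scores.shape[0]
    # The bound is log K plus the mean log-probability of the positive pairs,
    # so it can never exceed log K.
    return float(np.log(k) + np.mean(np.diag(log_softmax)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    past = rng.normal(size=(8, 4))
    # Futures that are noisy copies of the past share information with it.
    future = past + 0.1 * rng.normal(size=(8, 4))
    print(info_nce_lower_bound(past, future))
```

Because the estimate is capped at log K for a batch of K pairs, larger batches are needed to certify larger amounts of mutual information. In an agent, a bound of this kind would be maximized with respect to the encoders that produce the embeddings, alongside the usual RL losses.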
Taxonomy
Topics: Fuzzy Logic and Control Systems · Reinforcement Learning in Robotics · Intelligent Tutoring Systems and Adaptive Learning
Methods: Dilated Convolution · Global Average Pooling · Average Pooling · Convolution · 1x1 Convolution · Switchable Atrous Convolution
