Deep RL With Information Constrained Policies: Generalization in Continuous Control
Tailia Malloy, Chris R. Sims, Tim Klinger, Miao Liu, Matthew Riemer, Gerald Tesauro

TL;DR
This paper introduces Capacity-Limited Actor-Critic (CLAC), a reinforcement learning algorithm that constrains the information content of learned policies, inspired by the limited processing capacity of biological agents. The constraint improves generalization in continuous control tasks without sacrificing sample efficiency.
Contribution
The paper formalizes an information-theoretic constraint on policies, develops the CLAC algorithm, and demonstrates its effectiveness in enhancing generalization in continuous control environments.
Findings
CLAC outperforms alternative methods in generalization between training and test environments.
CLAC maintains high sample efficiency comparable to existing algorithms.
The approach is grounded in rate-distortion theory, providing a principled framework.
Abstract
Biological agents learn and act intelligently in spite of a highly limited capacity to process and store information. Many real-world problems involve continuous control, which represents a difficult task for artificial intelligence agents. In this paper we explore the potential learning advantages a natural constraint on information flow might confer onto artificial agents in continuous control tasks. We focus on the model-free reinforcement learning (RL) setting and formalize our approach in terms of an information-theoretic constraint on the complexity of learned policies. We show that our approach emerges in a principled fashion from the application of rate-distortion theory. We implement a novel Capacity-Limited Actor-Critic (CLAC) algorithm and situate it within a broader family of RL algorithms such as the Soft Actor Critic (SAC) and Mutual Information Reinforcement Learning…
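To make the idea of an information-theoretic constraint on policy complexity more concrete, here is a minimal sketch in Python. It is not the paper's implementation. It assumes a toy setting with discrete states and actions (the paper targets continuous control), measures policy complexity as the mutual information I(S;A) between states and actions, and subtracts it from the expected reward with a trade-off weight. The function names, the tabular policy, and the weight `beta` are all illustrative assumptions; the exact CLAC objective is not given in this summary.

```python
import numpy as np


def mutual_information(state_probs, policy):
    """Mutual information I(S;A) in nats for a tabular policy.

    state_probs: shape (S,), a distribution over states (assumed given).
    policy:      shape (S, A), row s holds pi(a | s).
    """
    state_probs = np.asarray(state_probs, dtype=float)
    policy = np.asarray(policy, dtype=float)
    # Marginal action distribution: p(a) = sum_s p(s) * pi(a | s).
    marginal = state_probs @ policy
    # Joint distribution p(s, a) = p(s) * pi(a | s).
    joint = state_probs[:, None] * policy
    # Zero-probability pairs contribute nothing (0 * log 0 is taken as 0).
    with np.errstate(divide="ignore", invalid="ignore"):
        log_ratio = np.where(joint > 0, np.log(policy / marginal), 0.0)
    return float(np.sum(joint * log_ratio))


def penalized_objective(expected_reward, state_probs, policy, beta):
    """Hypothetical capacity-penalized objective: reward minus beta * I(S;A)."""
    return expected_reward - beta * mutual_information(state_probs, policy)
```

Under this sketch, a policy that ignores the state carries zero information and pays no penalty, while a policy that picks a distinct action in each of two equally likely states carries log 2 nats. A larger `beta` therefore pushes toward simpler, less state-dependent behavior, which is the intuition behind trading reward against policy complexity.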
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Evolutionary Algorithms and Applications
Methods: Experience Replay · Adam · Dense Connections · Soft Actor Critic
