Deep RL With Information Constrained Policies: Generalization in Continuous Control

Tailia Malloy; Chris R. Sims; Tim Klinger; Miao Liu; Matthew Riemer; Gerald Tesauro

arXiv:2010.04646·cs.LG·May 16, 2025·5 cites

Open Access

TL;DR

This paper introduces Capacity-Limited Actor-Critic (CLAC), a reinforcement learning algorithm that places an information-theoretic constraint on the learned policy, motivated by the limited information-processing capacity of biological agents. The constraint improves generalization in continuous control tasks without sacrificing sample efficiency.

Contribution

The paper formalizes an information-theoretic constraint on policies, develops the CLAC algorithm, and demonstrates its effectiveness in enhancing generalization in continuous control environments.

Findings

01. CLAC outperforms alternative methods in generalization between training and test environments.

02. CLAC maintains high sample efficiency comparable to existing algorithms.

03. The approach is grounded in rate-distortion theory, providing a principled framework.

Abstract

Biological agents learn and act intelligently in spite of a highly limited capacity to process and store information. Many real-world problems involve continuous control, which represents a difficult task for artificial intelligence agents. In this paper we explore the potential learning advantages a natural constraint on information flow might confer onto artificial agents in continuous control tasks. We focus on the model-free reinforcement learning (RL) setting and formalize our approach in terms of an information-theoretic constraint on the complexity of learned policies. We show that our approach emerges in a principled fashion from the application of rate-distortion theory. We implement a novel Capacity-Limited Actor-Critic (CLAC) algorithm and situate it within a broader family of RL algorithms such as the Soft Actor Critic (SAC) and Mutual Information Reinforcement Learning…
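
As a rough illustration of what an information-theoretic constraint on policy complexity can look like, the sketch below measures the mutual information I(S;A) between states and actions for a small tabular policy and subtracts it, scaled by a coefficient, from an expected return. This is a toy example under stated assumptions, not the CLAC implementation: the paper works with continuous control, whereas the tabular setting, the function names `mutual_information` and `penalized_objective`, and the coefficient `beta` are all hypothetical choices made here for clarity.

```python
import math


def mutual_information(state_probs, policy):
    """Return I(S;A) in nats for a tabular policy.

    state_probs[s] is p(s); policy[s][a] is pi(a | s).
    Both are assumed to be valid probability distributions.
    """
    n_actions = len(policy[0])
    # Marginal action distribution: p(a) = sum_s p(s) * pi(a | s)
    marginal = [
        sum(p_s * row[a] for p_s, row in zip(state_probs, policy))
        for a in range(n_actions)
    ]
    mi = 0.0
    for p_s, row in zip(state_probs, policy):
        for a, p_a_given_s in enumerate(row):
            # Terms with zero probability contribute nothing (0 * log 0 = 0).
            if p_s > 0 and p_a_given_s > 0:
                mi += p_s * p_a_given_s * math.log(p_a_given_s / marginal[a])
    return mi


def penalized_objective(expected_return, state_probs, policy, beta):
    """Expected return minus beta times the policy's information cost.

    beta is a hypothetical trade-off coefficient: larger values favor
    simpler policies whose actions depend less on the state.
    """
    return expected_return - beta * mutual_information(state_probs, policy)


if __name__ == "__main__":
    states = [0.5, 0.5]
    state_blind = [[0.5, 0.5], [0.5, 0.5]]    # ignores the state entirely
    state_driven = [[1.0, 0.0], [0.0, 1.0]]   # action fully determined by state
    print(mutual_information(states, state_blind))
    print(mutual_information(states, state_driven))
```

A policy that ignores the state carries no information about it, while a policy whose action is fully determined by one of two equally likely states carries log 2 nats. In a rate-distortion reading, the information cost plays the role of the rate and the lost return plays the role of the distortion.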

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Evolutionary Algorithms and Applications

Methods: Experience Replay · Adam · Dense Connections · Soft Actor Critic