InfoBot: Transfer and Exploration via the Information Bottleneck

Anirudh Goyal; Riashat Islam; Daniel Strouse; Zafarali Ahmed; Matthew Botvinick; Hugo Larochelle; Yoshua Bengio; Sergey Levine

arXiv:1901.10902 · stat.ML · December 7, 2023 · 46 cites

TL;DR

This paper introduces InfoBot, a method that uses an information bottleneck to identify decision states in reinforcement learning. It improves exploration, especially in sparse-reward environments, by leveraging prior experience to find subgoals.

Contribution

The paper proposes a novel approach that learns decision states via an information bottleneck, enabling better exploration and transfer in reinforcement learning tasks.

Findings

01. Effectively identifies decision states in various environments.

02. Improves exploration by guiding agents to potential subgoals.

03. Works well even with partial observations.

Abstract

A central challenge in reinforcement learning is discovering effective policies for tasks where rewards are sparsely distributed. We postulate that in the absence of useful reward signals, an effective exploration strategy should seek out decision states. These states lie at critical junctions in the state space from where the agent can transition to new, potentially unexplored regions. We propose to learn about decision states from prior experience. By training a goal-conditioned policy with an information bottleneck, we can identify decision states by examining where the model actually leverages the goal state. We find that this simple mechanism effectively identifies decision states, even in partially observed settings. In effect, the model learns the sensory cues that correlate with potential subgoals. In new environments, this model can then identify novel subgoals for…
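As a rough illustration of "examining where the model actually leverages the goal state", the sketch below scores each state by how far a goal-conditioned latent distribution moves away from a goal-independent prior. It assumes the bottleneck is a diagonal Gaussian encoder over a latent code, regularized toward a standard normal prior, and that a large KL divergence signals heavy reliance on the goal. That parameterization, the function names, and the example numbers are all hypothetical and are not taken from the abstract.

```python
import numpy as np


def gaussian_kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions.

    mu and log_var are assumed outputs of a goal-conditioned encoder,
    one row per visited state (hypothetical parameterization).
    """
    mu = np.asarray(mu, dtype=float)
    log_var = np.asarray(log_var, dtype=float)
    return 0.5 * np.sum(np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=-1)


def rank_decision_states(mu, log_var, top_k=1):
    """Return indices of the top_k states with the largest KL, highest first."""
    kl = gaussian_kl_to_standard_normal(mu, log_var)
    order = np.argsort(-kl)  # sort by descending KL
    return order[:top_k], kl


# Made-up encoder outputs for 4 states with a 2-dimensional latent code.
mu = np.array([[0.0, 0.0], [0.1, -0.1], [2.0, 1.0], [0.0, 0.5]])
log_var = np.zeros((4, 2))  # unit variance in every dimension

top, kl = rank_decision_states(mu, log_var, top_k=2)
print("per-state KL:", kl)
print("candidate decision states:", top)
```

In this toy input, the state whose latent mean sits farthest from the prior gets the highest score, so it would be the first candidate decision state under the stated assumption.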

Taxonomy

Topics: Reinforcement Learning in Robotics · Neural dynamics and brain function · stochastic dynamics and bifurcation