Reward-Predictive Clustering

Lucas Lehnert; Michael J. Frank; Michael L. Littman

arXiv:2211.03281·cs.LG·November 8, 2022

TL;DR

This paper introduces a clustering algorithm that enables reward-predictive state abstractions to be used in deep learning, significantly improving learning speed and transferability in high-dimensional reinforcement learning tasks.

Contribution

It extends reward-predictive state abstractions from tabular to deep learning settings with a new clustering algorithm and provides theoretical and empirical validation.

Findings

01 Deep reward-predictive networks compress inputs effectively.

02 Significant acceleration of learning in visual control tasks.

03 Pre-trained representations can be reused for transfer learning.

Abstract

Recent advances in reinforcement-learning research have demonstrated impressive results in building algorithms that can outperform humans in complex tasks. Nevertheless, creating reinforcement-learning systems that can build abstractions of their experience to accelerate learning in new contexts remains an active area of research. Previous work showed that reward-predictive state abstractions fulfill this goal, but they have only been applied to tabular settings. Here, we provide a clustering algorithm that enables the application of such state abstractions to deep learning settings, providing compressed representations of an agent's inputs that preserve the ability to predict sequences of reward. A convergence theorem and simulations show that the resulting reward-predictive deep network maximally compresses the agent's inputs, significantly speeding up learning in high dimensional…
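To make the idea of a representation that "preserves the ability to predict sequences of reward" concrete, the following is a minimal sketch in the tabular, deterministic case. It is not the paper's deep-learning clustering algorithm. It is a simple partition-refinement loop, and the function name `refine_clusters` and the toy reward and transition tables are assumptions made purely for illustration. Two states end up in the same cluster only if every action gives them the same one-step reward and leads them into the same cluster, so every action sequence yields the same reward sequence from both.

```python
from typing import Dict, List, Tuple


def refine_clusters(rewards: List[List[float]],
                    next_state: List[List[int]]) -> List[int]:
    """Cluster the states of a deterministic tabular MDP (illustrative only).

    rewards[s][a]    : reward for taking action a in state s
    next_state[s][a] : state reached by taking action a in state s
    Returns a list mapping each state to a cluster id.
    """
    n_states = len(rewards)
    cluster = [0] * n_states  # start with every state in one cluster
    n_clusters = 1
    while True:
        # A state's signature: its one-step rewards, plus the clusters
        # its successors currently belong to.
        signatures = [
            (tuple(rewards[s]), tuple(cluster[t] for t in next_state[s]))
            for s in range(n_states)
        ]
        ids: Dict[Tuple, int] = {}
        new_cluster = [ids.setdefault(sig, len(ids)) for sig in signatures]
        if len(ids) == n_clusters:
            # No cluster was split in this pass, so the partition is stable.
            return new_cluster
        cluster, n_clusters = new_cluster, len(ids)


# Hypothetical 4-state, 2-action example: states 0 and 1 behave identically.
rewards = [[0, 0], [0, 0], [1, 0], [0, 0]]
next_state = [[2, 0], [2, 1], [3, 3], [3, 3]]
clusters = refine_clusters(rewards, next_state)
print(clusters)  # [0, 0, 1, 2]
```

In this toy example states 0 and 1 are merged because no sequence of actions can tell them apart by reward alone, while states 2 and 3 stay separate. The setting described in the abstract differs in that the inputs are high dimensional and the abstraction is computed by a deep network, which a tabular loop like this does not address.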

Taxonomy

Topics: Neural dynamics and brain function