From Memories to Maps: Mechanisms of In-Context Reinforcement Learning in Transformers
Ching Fang, Kanaka Rajan

TL;DR
This paper investigates how transformers can use episodic-like memory for rapid in-context reinforcement learning, and finds that caching computations in memory supports flexible decision-making in a way that resembles hippocampal function.
Contribution
It introduces a transformer-based model that uses memory caching for in-context reinforcement learning, offering a mechanistic understanding of rapid adaptation and representation alignment.
Findings
Memory supports in-context learning through caching computations.
Representations align across different environments, resembling hippocampal functions.
In-context RL strategies differ from standard model-free or model-based approaches.
Abstract
Humans and animals show remarkable learning efficiency, adapting to new environments with minimal experience. This capability is not well captured by standard reinforcement learning algorithms that rely on incremental value updates. Rapid adaptation likely depends on episodic memory -- the ability to retrieve specific past experiences to guide decisions in novel contexts. Transformers provide a useful setting for studying these questions because of their ability to learn rapidly in-context and because their key-value architecture resembles episodic memory systems in the brain. We train a transformer to in-context reinforcement learn in a distribution of planning tasks inspired by rodent behavior. We then characterize the learning algorithms that emerge in the model. We first find that representation learning is supported by in-context structure learning and cross-context alignment,…
Taxonomy
Topics: Reinforcement Learning in Robotics
