Context-Gated Associative Retrieval: From Theory to Transformers
Moulik Choraria, Argyrios Gerogiannis, Vidhata Jayaraman, Ankur Mani, Lav R. Varshney

TL;DR
This paper introduces a context-gated associative memory model that enhances retrieval by incorporating external context, with theoretical and empirical validation linking it to transformer-based in-context learning.
Contribution
The paper proposes a two-stage associative memory architecture with context gating, analyzes it theoretically, and relates it to in-context learning in transformer models such as Llama-3.
Findings
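Context gating increases inter-memory separation and induces sparsity, which translates into exponential improvements in retrieval.
The system admits a unique self-consistent fixed point, driven by a direct contextual bias and a second-order retrieval-gate feedback loop.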
In-context learning in transformers acts as context-gated retrieval.
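As a rough illustration of how a context gate can change what an associative memory retrieves, the sketch below applies a per-feature gate to a softmax (modern-Hopfield-style) retrieval step. It is not the paper's architecture: the gate is passed in directly as a hypothetical 0/1 mask rather than computed by a context sub-circuit, there is no feedback loop, and the function name `gated_retrieve`, the inverse temperature `beta`, and the toy patterns are all assumptions made for this example.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D score vector.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def gated_retrieve(memories, query, gate, beta=8.0):
    """One softmax retrieval step with a per-feature gate (illustrative only).

    memories: (N, d) stored patterns
    query:    (d,)   probe pattern
    gate:     (d,)   values in [0, 1]; assumed to come from some context signal
    """
    # The gate rescales each feature of both the memories and the query
    # before similarities are computed.
    scores = beta * (memories * gate) @ (query * gate)
    weights = softmax(scores)
    # The readout mixes the original (ungated) memories.
    return weights, weights @ memories

# Two stored patterns and a query that overlaps equally with both.
memories = np.array([[1.0, 0.0, 0.0, 1.0],
                     [0.0, 1.0, 1.0, 0.0]])
query = np.array([1.0, 0.0, 1.0, 0.0])

no_gate = np.ones(4)                      # no context: all features count
gate_a = np.array([1.0, 1.0, 0.0, 0.0])   # hypothetical context A
gate_b = np.array([0.0, 0.0, 1.0, 1.0])   # hypothetical context B

w_none, _ = gated_retrieve(memories, query, no_gate)
w_a, out_a = gated_retrieve(memories, query, gate_a)
w_b, out_b = gated_retrieve(memories, query, gate_b)

print("ungated weights: ", w_none)
print("context A weights:", w_a)
print("context B weights:", w_b)
```

In this toy setting the ungated query is ambiguous between the two memories, while each gate keeps only the features on which the query matches one memory, so the retrieval weights concentrate on that memory. Whether and how strongly this happens in the paper's model depends on details not given in this summary.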
Abstract
Hopfield networks and their generalizations have established deep connections among biological associative memories, statistical physics, and transformers. Yet most models treat retrieval as a fixed query-to-memory mapping, ignoring the role of external context in recall. In this work, we propose a two-stage associative memory architecture, wherein a context-gate subcircuit reshapes the retrieval energy landscape before and during recall. We show theoretically that context gating increases inter-memory separation while inducing sparsity, translating into exponential improvements in retrieval. Crucially, we prove that the system admits a unique self-consistent fixed point, revealing that the resulting retrieval state is driven by both a direct contextual bias and a second-order retrieval-gate feedback loop. We then bridge this theory to transformers; specifically, we evaluate a…
