Rethinking Associative Memory Mechanism in Induction Head
Shuo Wang, Issei Sato

TL;DR
This paper investigates how a two-layer transformer captures in-context information and balances it with pretrained bigram knowledge, providing theoretical analysis and experimental validation of associative memory mechanisms in in-context learning.
Contribution
It offers a theoretical analysis of transformer attention weights and logits in the context of associative memory, complemented by experiments with specially designed prompts.
Findings
Transformers encode in-context information and bigram knowledge in attention weights.
Theoretical predictions align with experimental results on prompt outputs.
The analysis offers insight into how the model balances in-context learning against pretrained knowledge.
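As an illustration of the associative-memory view mentioned above, here is a minimal sketch of a weight matrix that stores key-to-value pairs as a sum of outer products. It is not the paper's construction: the dimensions, the random Gaussian embeddings, the `pairs` mapping, and the helper names `build_associative_memory` and `recall` are all assumptions made for this example.

```python
import numpy as np

def build_associative_memory(keys, values):
    """Return W = sum_i outer(values[i], keys[i]), with shape (d, d)."""
    return values.T @ keys

def recall(W, query, candidates):
    """Index of the candidate c that maximizes the score c^T W query."""
    return int(np.argmax(candidates @ (W @ query)))

rng = np.random.default_rng(0)
d, n = 256, 8  # assumed embedding dimension and number of tokens

# Random Gaussian embeddings scaled by 1/sqrt(d) are nearly orthonormal
# when d is large, which is what makes the lookup work.
keys = rng.standard_normal((n, d)) / np.sqrt(d)
values = rng.standard_normal((n, d)) / np.sqrt(d)

# Hypothetical stored associations: key i maps to value pairs[i].
pairs = np.array([3, 0, 7, 1, 6, 2, 5, 4])
W = build_associative_memory(keys, values[pairs])

# Querying with key i should score the associated value highest.
recalled = [recall(W, keys[i], values) for i in range(n)]
```

Because the embeddings are only approximately orthogonal, each lookup carries a small amount of cross-talk from the other stored pairs; with `d` much larger than `n`, the correct value still dominates the score.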
Abstract
The induction head mechanism is part of the computational circuits for in-context learning (ICL) that enable large language models (LLMs) to adapt to new tasks without fine-tuning. Most existing work explains the training dynamics behind acquiring such a powerful mechanism. However, the model's ability to coordinate in-context information over long contexts and global knowledge acquired during pretraining remains poorly understood. This paper investigates how a two-layer transformer thoroughly captures in-context information and balances it with pretrained bigram knowledge in next-token prediction, from the viewpoint of associative memory. We theoretically analyze the representation of weight matrices in attention layers and the resulting logits when a transformer is given prompts generated by a bigram model. In the experiments, we design specific prompts to evaluate whether the outputs…
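To make "prompts generated by a bigram model" concrete, the following sketch samples a token sequence in which each token depends only on the one before it. The abstract does not give the paper's data-generating details, so the vocabulary size, the Dirichlet-sampled transition matrix, and the function name `sample_bigram_sequence` are assumptions for illustration only.

```python
import numpy as np

def sample_bigram_sequence(transition, start, length, rng):
    """Sample `length` tokens where each next token is drawn from
    the row of `transition` indexed by the current token."""
    tokens = [start]
    for _ in range(length - 1):
        probs = transition[tokens[-1]]
        tokens.append(int(rng.choice(len(probs), p=probs)))
    return tokens

rng = np.random.default_rng(0)
vocab_size = 5  # assumed toy vocabulary

# Hypothetical transition matrix: each row is a distribution over next tokens.
transition = rng.dirichlet(np.ones(vocab_size), size=vocab_size)

prompt = sample_bigram_sequence(transition, start=0, length=20, rng=rng)
```

A model trained on many such sequences can learn the global transition statistics, while any single prompt also carries its own in-context token pairs; the tension between the two is what the abstract describes.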
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning
Methods: ALIGN
