Understanding Transformers through the Lens of Pavlovian Conditioning
Mu Qiao

TL;DR
This paper introduces a novel theoretical framework that interprets transformer attention mechanisms as Pavlovian conditioning, providing insights into their associative memory capabilities and architectural trade-offs.
Contribution
It reinterprets attention as an associative process akin to Pavlovian conditioning, offering new theoretical insights and capacity bounds for transformer models.
Findings
Attention can store O(√d_k) associations with error-free retrieval.
Average-case retrieval fidelity scales as O(d_k).
The framework reveals fundamental trade-offs in transformer architecture design.
Abstract
Transformer architectures have revolutionized artificial intelligence (AI) through their attention mechanisms, yet the computational principles underlying their success remain opaque. We present a novel theoretical framework that reinterprets the core computation of attention as Pavlovian conditioning. Our model finds a direct mathematical analogue in linear attention, which simplifies the analysis of the underlying associative process. We demonstrate that attention's queries, keys, and values can be mapped to the three elements of classical conditioning: test stimuli that probe associations, conditional stimuli (CS) that serve as retrieval cues, and unconditional stimuli (US) that contain response information. Through this lens, we suggest that each attention operation constructs a transient associative memory via a Hebbian rule, where CS-US pairs form dynamic associations that test…
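The mapping described in the abstract can be illustrated with a minimal sketch of unnormalized linear attention. This is not the paper's code. It assumes the standard outer-product (Hebbian) form of linear attention, in which each key (CS) and value (US) pair adds the outer product of value and key to a memory matrix S, and a query (test stimulus) reads the memory out as S times q. The helper names `build_memory` and `retrieve` and the toy vectors are hypothetical, chosen only for illustration.

```python
# Sketch only: unnormalized linear attention viewed as a Hebbian associative memory.
# Assumption: memory S = sum_i outer(v_i, k_i); retrieval is S @ q.
# All names and numbers here are illustrative, not taken from the paper.

def outer(u, v):
    """Outer product of two vectors as a list-of-lists matrix."""
    return [[ui * vj for vj in v] for ui in u]

def mat_add(a, b):
    """Element-wise sum of two matrices of equal shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_vec(m, x):
    """Matrix-vector product."""
    return [sum(mij * xj for mij, xj in zip(row, x)) for row in m]

def build_memory(keys, values):
    """Accumulate one Hebbian update per (key, value) pair, i.e. per CS-US pairing."""
    d_v, d_k = len(values[0]), len(keys[0])
    s = [[0.0] * d_k for _ in range(d_v)]
    for k, v in zip(keys, values):
        s = mat_add(s, outer(v, k))
    return s

def retrieve(s, query):
    """Probe the memory with a test stimulus (the query)."""
    return mat_vec(s, query)

# Two orthonormal keys (CS), each paired with a value (US).
keys = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
values = [[2.0, -1.0], [0.5, 4.0]]

memory = build_memory(keys, values)
recalled = retrieve(memory, keys[0])        # query matches the first key exactly
mixed = retrieve(memory, [1.0, 1.0, 0.0])   # query overlaps both keys

print(recalled)  # [2.0, -1.0]
print(mixed)     # [2.5, 3.0]
```

With orthonormal keys the stored value comes back exactly. A query that overlaps several keys returns a sum of their values, which is the kind of interference that a capacity bound has to account for once keys are no longer orthogonal.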
