The Neuroscience of Transformers
Peter Koenig, Mario Negrello

TL;DR
This paper explores how transformer models can serve as a computational analogy for cortical microcircuit organization, generating testable hypotheses about brain function and fostering cross-disciplinary insights.
Contribution
It proposes a hypothetical mapping between transformer operations and laminar cortical features, offered as an orienting framework for analysis rather than as a claim that cortex literally implements transformer equations.
Findings
Predictions about laminar specialization and contextual modulation
Hypotheses on dendritic integration and oscillatory coordination
Framework for reciprocal insights between neuroscience and AI
Abstract
Neuroscience has long informed the development of artificial neural networks, but the success of modern architectures invites, in turn, the converse: can modern networks teach us lessons about brain function? Here, we examine the structure of the cortical column and propose that the transformer provides a natural computational analogy for multiple elements of cortical microcircuit organization. Rather than claiming a literal implementation of transformer equations in cortex, we develop a hypothetical mapping between transformer operations and laminar cortical features, using the analogy as an orienting framework for analysis and discussion. This mapping allows us to examine in greater depth how contextual selection, content routing, recurrent integration, and interlaminar transformations may be distributed across cortical circuitry. In doing so, we generate a broad set of predictions…
Taxonomy
Topics: Functional Brain Connectivity Studies · Neural dynamics and brain function · Ferroelectric and Negative Capacitance Devices
