Disentangling Syntax and Semantics in the Brain with Deep Networks
Charlotte Caucheteux, Alexandre Gramfort, Jean-Remi King

TL;DR
This study introduces a framework to disentangle and analyze the neural representations of syntax and semantics in the brain using deep language models and fMRI data, revealing shared neural substrates.
Contribution
It proposes a taxonomy and statistical method to decompose language model activations and brain activity into distinct linguistic classes, challenging previous modular assumptions.
Findings
Compositional representations recruit a more widespread cortical network than lexical ones, spanning the bilateral temporal, parietal and prefrontal cortices.
Syntax and semantics share a common neural substrate.
The framework isolates distributed linguistic representations in brain activity.
Abstract
The activations of language transformers like GPT-2 have been shown to linearly map onto brain activity during speech comprehension. However, the nature of these activations remains largely unknown, and they presumably conflate distinct linguistic classes. Here, we propose a taxonomy to factorize the high-dimensional activations of language models into four combinatorial classes: lexical, compositional, syntactic, and semantic representations. We then introduce a statistical method to decompose, through the lens of GPT-2's activations, the brain activity of 345 subjects recorded with functional magnetic resonance imaging (fMRI) while they listened to ~4.6 hours of narrated text. The results highlight two findings. First, compositional representations recruit a more widespread cortical network than lexical ones, and encompass the bilateral temporal, parietal and prefrontal cortices. Second,…
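The "linear mapping" mentioned in the abstract can be illustrated with a small sketch. This is not the authors' pipeline: it assumes a closed-form ridge regression, purely synthetic data standing in for model activations and voxel responses, and a per-voxel Pearson correlation on held-out samples as the score. The names `ridge_fit` and `brain_score` are hypothetical, and a real fMRI analysis would also have to handle the hemodynamic delay and cross-validation, which are omitted here.

```python
import numpy as np


def ridge_fit(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^{-1} X^T Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)


def brain_score(X_train, Y_train, X_test, Y_test, alpha=1.0):
    """Fit a linear map on the training split, then return the Pearson
    correlation between predicted and observed responses for each voxel
    on the held-out split."""
    # Center with training statistics only, so no test information leaks in.
    x_mean = X_train.mean(axis=0)
    y_mean = Y_train.mean(axis=0)
    W = ridge_fit(X_train - x_mean, Y_train - y_mean, alpha)
    Y_pred = (X_test - x_mean) @ W + y_mean

    # Pearson correlation, computed column-wise (one value per voxel).
    pred_c = Y_pred - Y_pred.mean(axis=0)
    true_c = Y_test - Y_test.mean(axis=0)
    denom = np.linalg.norm(pred_c, axis=0) * np.linalg.norm(true_c, axis=0)
    return (pred_c * true_c).sum(axis=0) / denom


# Synthetic stand-ins (assumptions, not real data):
# X plays the role of language-model activations, Y of voxel responses.
rng = np.random.default_rng(0)
n_samples, n_features, n_voxels = 400, 20, 5
X = rng.standard_normal((n_samples, n_features))
W_true = rng.standard_normal((n_features, n_voxels))
Y = X @ W_true + 0.5 * rng.standard_normal((n_samples, n_voxels))

scores = brain_score(X[:300], Y[:300], X[300:], Y[300:])
print(scores.shape)  # one score per simulated voxel
```

Under these assumptions, comparing the scores obtained from different feature sets (for example, one set per class in the taxonomy) is one way to ask which features account for a voxel's activity.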
Taxonomy
Topics: Neurobiology of Language and Bilingualism · Topic Modeling · Natural Language Processing Techniques
Methods: Attention Is All You Need · Linear Layer · Cosine Annealing · Residual Connection · Linear Warmup With Cosine Annealing · Attention Dropout · Dense Connections · Softmax · Dropout
