Evidence Against Syntactic Encapsulation in Large Language Models

Thomas A. McGee; Yiyang Zhang; Idan A. Blank

PMC · DOI: 10.1111/cogs.70187 · March 10, 2026



Open Access

TL;DR

This paper shows that syntax-specialized components in large language models are influenced by semantic information, similar to how humans process language.

Contribution

The study provides evidence against syntactic encapsulation in LLMs by showing semantic modulation of syntax-specialized attention heads.

Findings

1. Implausible semantic information reduces attention in syntax-specialized heads across BERT, GPT-2, and Llama 2.

2. Syntax-specialized heads are not fully encapsulated from semantic influences.

3. Findings align with human-like integration of syntax and semantics.

Abstract

Transformer‐based large language models (LLMs) have recently demonstrated exceptional performance in a variety of linguistic tasks. LLMs primarily combine information across words in a sentence using the attention mechanism, implemented by “attention heads”: these components assign numerical weights linking different words in the input to one another, capturing different relationships between these words. Some attention heads automatically learn to assign weights that accurately encode meaningful linguistic features including, importantly, heads that appear specialized for identifying particular syntactic dependencies. Are syntactic computations in such heads “encapsulated”, i.e., impenetrable to the influence of non‐syntactic information? Such encapsulated computations would be strikingly different from those of the human mind, where non‐syntactic information sources (e.g., semantics)…
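To make the abstract's description of attention heads concrete, here is a minimal sketch of how a single head could turn per-word vectors into numerical weights linking words to one another. It assumes standard scaled dot-product attention; the toy sentence, the 2-dimensional vectors, and the names `attention_weights` and `obj_to_verb` are hypothetical illustrations and are not taken from the paper or from any of the models it studies.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries, keys):
    """Scaled dot-product attention weights for one head.

    queries[i] and keys[j] are equal-length vectors for tokens i and j.
    Row i of the result says how strongly token i attends to each token j.
    """
    dim = len(keys[0])
    rows = []
    for q in queries:
        scores = [
            sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys
        ]
        rows.append(softmax(scores))
    return rows


# Hypothetical toy sentence with made-up vectors (assumption, not real model data).
tokens = ["the", "chef", "cooked", "dinner"]
queries = [[0.1, 0.0], [0.2, 0.1], [0.3, 0.2], [2.0, 0.5]]
keys = [[0.0, 0.1], [0.1, 0.3], [2.0, 0.4], [0.2, 0.1]]

weights = attention_weights(queries, keys)

# Weight linking the object "dinner" to its verb "cooked" in this toy head.
obj_to_verb = weights[tokens.index("dinner")][tokens.index("cooked")]
print(f"dinner -> cooked weight: {obj_to_verb:.2f}")
```

In this toy setup the vectors were chosen by hand so that the object attends mostly to its verb, which is the kind of pattern a head specialized for a syntactic dependency would show. In a trained model the queries and keys come from learned projections of the word representations.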




Taxonomy

Topics: Neurobiology of Language and Bilingualism · Text Readability and Simplification · Multimodal Machine Learning Applications