How Syntax Specialization Emerges in Language Models

Xufeng Duan; Zhaoqian Yao; Yunhao Zhang; Shaonan Wang; Zhenguang G. Cai

arXiv:2505.19548·cs.CL·May 27, 2025

Open Access

TL;DR

This paper tracks how large language models develop internal syntactic specialization during training. Specialization emerges gradually, concentrates in specific layers, and is influenced by model size and training data.

Contribution

The paper provides the first detailed analysis of the developmental trajectory of syntactic specialization in LLMs, showing how it emerges and which factors influence it.

Findings

01 Syntactic sensitivity emerges gradually during training.

02 Specialization concentrates in specific layers.

03 Development is influenced by model scale and training data.

Abstract

Large language models (LLMs) have been found to develop surprising internal specializations: Individual neurons, attention heads, and circuits become selectively sensitive to syntactic structure, reflecting patterns observed in the human brain. While this specialization is well documented, how it emerges during training and what influences its development remain largely unknown. In this work, we tap into the black box of specialization by tracking its formation over time. By quantifying internal syntactic consistency across minimal pairs from various syntactic phenomena, we identify a clear developmental trajectory: Syntactic sensitivity emerges gradually, concentrates in specific layers, and exhibits a 'critical period' of rapid internal specialization. This process is consistent across architectures and initialization parameters (e.g., random seeds), and is influenced by model…
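
The abstract mentions quantifying internal syntactic consistency across minimal pairs but does not spell out the metric here. As a rough illustration only, the sketch below assumes one plausible reading: for each minimal pair, take the difference between the activation vector for the grammatical sentence and the one for the ungrammatical sentence, then score a phenomenon by the mean pairwise cosine similarity of those difference vectors. The function names (`cosine`, `pair_differences`, `consistency`) and the toy activation values are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def pair_differences(pairs):
    """For each (grammatical, ungrammatical) activation pair, return good - bad."""
    return [[g - b for g, b in zip(good, bad)] for good, bad in pairs]


def consistency(pairs):
    """Mean pairwise cosine similarity of the per-pair difference vectors.

    Assumption for this sketch: a high value means the layer reacts to the
    grammaticality contrast in a similar direction across pairs.
    """
    diffs = pair_differences(pairs)
    sims = [cosine(u, v) for u, v in combinations(diffs, 2)]
    if not sims:
        raise ValueError("need at least two minimal pairs")
    return sum(sims) / len(sims)


# Toy, made-up activations (not real model data).
# Every difference vector points along the first axis.
aligned = [
    ([1.0, 0.0, 2.0], [0.0, 0.0, 2.0]),  # diff = [1.0, 0.0, 0.0]
    ([3.0, 1.0, 0.0], [1.0, 1.0, 0.0]),  # diff = [2.0, 0.0, 0.0]
    ([0.5, 4.0, 1.0], [0.0, 4.0, 1.0]),  # diff = [0.5, 0.0, 0.0]
]
# Difference vectors are mutually orthogonal.
scattered = [
    ([1.0, 0.0, 0.0], [0.0, 0.0, 0.0]),  # diff = [1.0, 0.0, 0.0]
    ([0.0, 1.0, 0.0], [0.0, 0.0, 0.0]),  # diff = [0.0, 1.0, 0.0]
    ([0.0, 0.0, 1.0], [0.0, 0.0, 0.0]),  # diff = [0.0, 0.0, 1.0]
]

print(consistency(aligned))
print(consistency(scattered))
```

Under this assumed definition, repeating the computation per layer and per training checkpoint would give the kind of layer-by-time picture the abstract describes; the paper's actual measure may differ.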

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Softmax · Attention Is All You Need