TL;DR
This paper asks whether large language models (LLMs) develop distinct mechanisms for processing hierarchical versus linear grammars. It finds that separate model components handle the two grammar types, and that the hierarchy-sensitive components remain active even on nonce inputs, which carry no meaning.
Contribution
The study shows that LLMs process hierarchical and linear grammars with separate components, and that the hierarchy-sensitive components remain active on nonce inputs, suggesting a sensitivity to structure that does not depend on word meaning.
Findings
Language models differentiate between hierarchical and linear inputs.
Distinct components are responsible for processing different grammar types.
Hierarchy sensitivity persists even on nonce, non-meaningful inputs.
Abstract
All natural languages are structured hierarchically. In humans, this structural restriction is neurologically coded: when two grammars are presented with identical vocabularies, brain areas responsible for language processing are only sensitive to hierarchical grammars. Using large language models (LLMs), we investigate whether such functionally distinct hierarchical processing regions can arise solely from exposure to large-scale language distributions. We generate inputs using English, Italian, Japanese, or nonce words, varying the underlying grammars to conform to either hierarchical or linear/positional rules. Using these grammars, we first observe that language models show distinct behaviors on hierarchical versus linearly structured inputs. Then, we find that the components responsible for processing hierarchical grammars are distinct from those that process linear grammars; we…
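To make the hierarchical versus linear/positional contrast in the abstract concrete, here is a minimal Python sketch of how two such grammars could be applied to the same nonce vocabulary. It is an illustration only, not the authors' generation procedure: the word lists, the marker `ka`, the two rules, and all function names are hypothetical choices made for this example.

```python
import random

# Hypothetical nonce vocabulary and marker; none of these come from the paper.
NOUNS = ["blick", "dax", "wug"]
VERBS = ["florp", "zib", "mep"]
MARKER = "ka"


def base_sentence(rng, with_relative_clause):
    """Build a nonce sentence and return (tokens, index of the main-clause verb).

    Shape without a relative clause: N V N
    Shape with a relative clause:    N [that V N] V N
    """
    tokens = [rng.choice(NOUNS)]
    if with_relative_clause:
        tokens += ["that", rng.choice(VERBS), rng.choice(NOUNS)]
    main_verb_idx = len(tokens)
    tokens += [rng.choice(VERBS), rng.choice(NOUNS)]
    return tokens, main_verb_idx


def apply_hierarchical(tokens, main_verb_idx):
    """Structure-dependent rule: the marker follows the main-clause verb,
    so its surface position shifts when a relative clause is embedded."""
    cut = main_verb_idx + 1
    return tokens[:cut] + [MARKER] + tokens[cut:]


def apply_linear(tokens, position=2):
    """Positional rule: the marker always follows the second token,
    regardless of the sentence's structure."""
    return tokens[:position] + [MARKER] + tokens[position:]


if __name__ == "__main__":
    rng = random.Random(0)
    for with_rc in (False, True):
        tokens, verb_idx = base_sentence(rng, with_rc)
        print("base:        ", " ".join(tokens))
        print("hierarchical:", " ".join(apply_hierarchical(tokens, verb_idx)))
        print("linear:      ", " ".join(apply_linear(tokens)))
```

In this toy setup the two rules give the same string for the simple `N V N` sentence and diverge only once a relative clause is embedded, which is why the structure of the input has to vary for the two grammar types to be told apart.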
