Mechanisms for Handling Nested Dependencies in Neural-Network Language Models and Humans
Yair Lakretz, Dieuwke Hupkes, Alessandra Vergallito, Marco Marelli, Marco Baroni, Stanislas Dehaene

TL;DR
This study investigates whether a neural network trained for language prediction can mimic human sentence processing, especially in handling nested dependencies and agreement, revealing similarities and differences in their mechanisms.
Contribution
The paper shows that a recurrent neural network trained only to predict the next word develops a sparse set of units specialized for syntactic number agreement, and it identifies where this mechanism breaks down.
Findings
The network handles local and long-distance number agreement.
Its agreement mechanism does not support full recursion and fails on some embedded dependencies.
Humans outperform the model on long-range embedded dependencies.
Abstract
Recursive processing in sentence comprehension is considered a hallmark of human linguistic abilities. However, its underlying neural mechanisms remain largely unknown. We studied whether a modern artificial neural network trained with "deep learning" methods mimics a central aspect of human sentence processing, namely the storing of grammatical number and gender information in working memory and its use in long-distance agreement (e.g., capturing the correct number agreement between subject and verb when they are separated by other phrases). Although the network, a recurrent architecture with Long Short-Term Memory units, was solely trained to predict the next word in a large corpus, analysis showed the emergence of a very sparse set of specialized units that successfully handled local and long-distance syntactic agreement for grammatical number. However, the simulations also showed…
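To make the agreement task concrete, here is a minimal sketch of how subject-verb number agreement can be scored for a next-word predictor: the model counts as correct when it assigns higher probability to the verb form that agrees with the subject. This is an illustration under stated assumptions, not the authors' evaluation code. The `NextWordProbs` interface, the function names, the `last_noun_heuristic` stand-in model, and the example sentences are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical interface: maps a sentence prefix (list of tokens) to a
# probability for each candidate next word. A real language model would
# be plugged in here.
NextWordProbs = Callable[[List[str]], Dict[str, float]]
Item = Tuple[List[str], str, str]  # (prefix, correct verb, wrong verb)


def agreement_correct(probs: NextWordProbs, prefix: List[str],
                      correct_verb: str, wrong_verb: str) -> bool:
    """True if the model prefers the verb form that agrees with the subject."""
    p = probs(prefix)
    return p.get(correct_verb, 0.0) > p.get(wrong_verb, 0.0)


def agreement_accuracy(probs: NextWordProbs, items: List[Item]) -> float:
    """Fraction of items on which the agreeing verb form is preferred."""
    hits = sum(agreement_correct(probs, *item) for item in items)
    return hits / len(items)


def last_noun_heuristic(prefix: List[str]) -> Dict[str, float]:
    """Toy stand-in model (assumption): agree with the most recent noun."""
    plural_nouns = {"keys", "cabinets"}
    singular_nouns = {"key", "cabinet"}
    for tok in reversed(prefix):
        if tok in plural_nouns:
            return {"are": 0.9, "is": 0.1}
        if tok in singular_nouns:
            return {"is": 0.9, "are": 0.1}
    return {"is": 0.5, "are": 0.5}


items: List[Item] = [
    (["the", "key"], "is", "are"),                             # local
    (["the", "keys"], "are", "is"),                            # local
    (["the", "key", "near", "the", "cabinets"], "is", "are"),  # long-distance
    (["the", "keys", "near", "the", "cabinet"], "are", "is"),  # long-distance
]
accuracy = agreement_accuracy(last_noun_heuristic, items)
```

The toy heuristic gets the local items right but is misled by the intervening noun in the long-distance items, which is the kind of error a model must avoid by keeping the subject's number in memory across the intervening phrase.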
