Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies
Tal Linzen, Emmanuel Dupoux, Yoav Goldberg

TL;DR
This paper investigates whether LSTM neural networks can learn syntax-sensitive dependencies such as English subject-verb number agreement. With explicit grammatical supervision the LSTM makes less than 1% errors overall, but errors rise sharply when it is trained only with a language modeling objective, which suggests that stronger architectures or more direct supervision may be needed.
Contribution
The study probes an LSTM's grammatical competence on number agreement under two kinds of training objective: explicit grammatical targets (number prediction, grammaticality judgments) and language modeling. It shows that the explicit targets yield far fewer agreement errors than language modeling does.
Findings
LSTMs achieve high accuracy with explicit grammatical supervision.
Errors increase when sequential and structural cues conflict.
With a language modeling objective alone, agreement errors rise sharply.
Abstract
The success of long short-term memory (LSTM) neural networks in language processing is typically attributed to their ability to capture long-distance statistical regularities. Linguistic regularities are often sensitive to syntactic structure; can such dependencies be captured by LSTMs, which do not have explicit structural representations? We begin addressing this question using number agreement in English subject-verb dependencies. We probe the architecture's grammatical competence both using training objectives with an explicit grammatical target (number prediction, grammaticality judgments) and using language models. In the strongly supervised settings, the LSTM achieved very high overall accuracy (less than 1% errors), but errors increased when sequential and structural information conflicted. The frequency of such errors rose sharply in the language-modeling setting. We conclude…
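The number prediction objective mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the Example class, the make_example function, and the use of Penn Treebank-style tags (NN/NNS for nouns, VBZ/VBP for present-tense verbs) are assumptions made for illustration. So is the attractor count, which here means the number of nouns between subject and verb whose number differs from the subject's, as one simple way to flag cases where sequential and structural cues conflict.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed tag-to-number mappings (Penn Treebank-style tags).
NOUN_NUMBER = {"NN": "singular", "NNS": "plural"}
VERB_NUMBER = {"VBZ": "singular", "VBP": "plural"}


@dataclass
class Example:
    """Hypothetical container for one number prediction example."""
    prefix: List[str]   # words preceding the verb (the model's input)
    label: str          # number of the upcoming verb (the target)
    n_attractors: int   # intervening nouns that disagree with the subject


def make_example(tagged: List[Tuple[str, str]],
                 subject_idx: int,
                 verb_idx: int) -> Example:
    """Build one example from a tagged sentence with known subject and verb."""
    subject_number = NOUN_NUMBER[tagged[subject_idx][1]]
    label = VERB_NUMBER[tagged[verb_idx][1]]
    # Count nouns strictly between subject and verb with the opposite number.
    between = tagged[subject_idx + 1:verb_idx]
    n_attractors = sum(
        1 for _, tag in between
        if tag in NOUN_NUMBER and NOUN_NUMBER[tag] != subject_number
    )
    prefix = [word for word, _ in tagged[:verb_idx]]
    return Example(prefix=prefix, label=label, n_attractors=n_attractors)


sentence = [
    ("The", "DT"), ("keys", "NNS"), ("to", "TO"), ("the", "DT"),
    ("cabinet", "NN"), ("are", "VBP"), ("on", "IN"), ("the", "DT"),
    ("table", "NN"),
]
example = make_example(sentence, subject_idx=1, verb_idx=5)
print(example)
```

In this example the most recent noun ("cabinet") is singular while the structural subject ("keys") is plural, so a model relying on the nearest noun would predict the wrong number. Grouping error rates by n_attractors is one way to separate such conflict cases from the easy ones.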
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
