TL;DR
This paper demonstrates that RNNs trained on language modeling can learn to understand hierarchical syntactic structures, effectively predicting long-distance agreement across multiple languages and aligning closely with human performance.
Contribution
It provides evidence that RNNs can acquire deep grammatical knowledge, not just surface patterns, in multilingual settings through language modeling tasks.
Findings
RNNs reliably predict long-distance agreement in multiple languages.
RNN performance approaches human accuracy on syntactic tasks.
RNNs exhibit understanding of hierarchical syntactic structures.
Abstract
Recurrent neural networks (RNNs) have achieved impressive results in a variety of linguistic processing tasks, suggesting that they can induce non-trivial properties of language. We investigate here to what extent RNNs learn to track abstract hierarchical syntactic structure. We test whether RNNs trained with a generic language modeling objective in four languages (Italian, English, Hebrew, Russian) can predict long-distance number agreement in various constructions. We include in our evaluation nonsensical sentences where RNNs cannot rely on semantic or lexical cues ("The colorless green ideas I ate with the chair sleep furiously"), and, for Italian, we compare model performance to human intuitions. Our language-model-trained RNNs make reliable predictions about long-distance agreement, and do not lag much behind human performance. We thus bring support to the hypothesis that RNNs are…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
