Colorless green recurrent networks dream hierarchically

Kristina Gulordava; Piotr Bojanowski; Edouard Grave; Tal Linzen; Marco Baroni

arXiv:1803.11138·cs.CL·March 30, 2018

TL;DR

This paper shows that RNNs trained with a generic language modeling objective in four languages (Italian, English, Hebrew, Russian) reliably predict long-distance number agreement, even in nonsensical sentences that offer no semantic or lexical cues, and do not lag much behind human performance in Italian.

Contribution

It provides evidence that RNNs trained only on language modeling can track abstract hierarchical syntactic structure, not just surface patterns, in each of the four languages tested.

Findings

01

RNNs reliably predict long-distance agreement in multiple languages.

02

In Italian, where model predictions were compared to human intuitions, RNN performance does not lag much behind human performance.

03

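RNN predictions hold up on nonsensical sentences without semantic or lexical cues, which supports the view that the networks track hierarchical syntactic structure.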

Abstract

Recurrent neural networks (RNNs) have achieved impressive results in a variety of linguistic processing tasks, suggesting that they can induce non-trivial properties of language. We investigate here to what extent RNNs learn to track abstract hierarchical syntactic structure. We test whether RNNs trained with a generic language modeling objective in four languages (Italian, English, Hebrew, Russian) can predict long-distance number agreement in various constructions. We include in our evaluation nonsensical sentences where RNNs cannot rely on semantic or lexical cues ("The colorless green ideas I ate with the chair sleep furiously"), and, for Italian, we compare model performance to human intuitions. Our language-model-trained RNNs make reliable predictions about long-distance agreement, and do not lag much behind human performance. We thus bring support to the hypothesis that RNNs are…
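One way to picture the agreement test described in the abstract is as a forced choice: given the words before the target, does the model score the correctly inflected form above the form with the opposite number? The sketch below is a minimal illustration under that assumption, not the authors' evaluation code. The names `AgreementItem`, `agreement_accuracy`, and `make_toy_scorer` are hypothetical, the second test sentence is an invented variant of the abstract's example, and the two toy scorers are hand-written stand-ins for a trained language model.

```python
from typing import Callable, List, NamedTuple


class AgreementItem(NamedTuple):
    prefix: List[str]  # words before the target position
    correct: str       # target form that agrees in number with its subject
    wrong: str         # same word with the opposite number


# Hypothetical scorer interface: any function returning a score such as
# log P(word | prefix) from a language model.
Scorer = Callable[[List[str], str], float]


def agreement_accuracy(items: List[AgreementItem], score: Scorer) -> float:
    """Fraction of items where the correct form outscores the wrong form."""
    if not items:
        raise ValueError("need at least one item")
    hits = sum(
        1 for it in items
        if score(it.prefix, it.correct) > score(it.prefix, it.wrong)
    )
    return hits / len(items)


# Toy lexicon (assumption): maps each known noun to whether it is plural.
NOUN_IS_PLURAL = {"ideas": True, "idea": False, "chairs": True, "chair": False}


def _verb_is_plural(verb: str) -> bool:
    # Toy rule for English present tense: "sleeps" is singular, "sleep" plural.
    return not verb.endswith("s")


def make_toy_scorer(use_nearest_noun: bool) -> Scorer:
    """Stand-in for a model: agrees the verb with the first or the nearest noun."""
    def score(prefix: List[str], word: str) -> float:
        nouns = [w for w in prefix if w in NOUN_IS_PLURAL]
        if not nouns:
            return 0.0
        cue = nouns[-1] if use_nearest_noun else nouns[0]
        return 0.0 if NOUN_IS_PLURAL[cue] == _verb_is_plural(word) else -1.0
    return score


items = [
    # Prefix taken from the abstract's example sentence.
    AgreementItem("The colorless green ideas I ate with the chair".split(),
                  "sleep", "sleeps"),
    # Invented variant with the numbers swapped.
    AgreementItem("The colorless green idea I ate with the chairs".split(),
                  "sleeps", "sleep"),
]

structural = agreement_accuracy(items, make_toy_scorer(use_nearest_noun=False))
linear = agreement_accuracy(items, make_toy_scorer(use_nearest_noun=True))
print(structural, linear)
```

In both items the noun nearest the verb has the opposite number from the subject, so a scorer that only looks at the nearest noun fails where one that finds the subject succeeds. Replacing the toy scorer with log-probabilities from a real language model would turn this into an actual measurement.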
