RNNs can generate bounded hierarchical languages with optimal memory

John Hewitt; Michael Hahn; Surya Ganguli; Percy Liang; Christopher D. Manning

arXiv:2010.07515·cs.CL·October 16, 2020


TL;DR

This paper proves that finite-precision RNNs can generate bounded hierarchical languages, which model the scaffolding of natural language syntax, using O(m log k) hidden units, exponentially fewer than the O(k^(m/2)) required by the best previously known constructions.

Contribution

The paper proves that RNNs can generate bounded hierarchical languages with optimal memory, reducing the known memory requirements exponentially through an explicit construction.

Findings

01

RNNs can generate bounded hierarchical languages with O(m log k) memory.

02

A matching lower bound shows that no algorithm, even with unbounded computation, can generate these languages with o(m log k) memory.

03

The upper bound is proved by an explicit construction of an RNN that generates the language.
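To make the gap between the two bounds concrete, the Python sketch below compares the growth of k^(m/2) with m log k and counts the bracket stacks a Dyck-(k, m) generator has to tell apart. The function names are hypothetical, constant factors are ignored, and the counting is a back-of-the-envelope illustration under these assumptions, not the paper's proof.

```python
import math


def stack_configurations(k: int, m: int) -> int:
    """Number of distinct stacks of open brackets: depths 0..m, k types per slot."""
    return sum(k ** depth for depth in range(m + 1))


def bits_to_distinguish(k: int, m: int) -> int:
    """ceil(log2(#stacks)): bits needed to give every stack its own code."""
    return (stack_configurations(k, m) - 1).bit_length()


def prior_bound_units(k: int, m: int) -> float:
    """Growth term of the earlier O(k^(m/2)) bound (constants ignored)."""
    return k ** (m / 2)


def new_bound_units(k: int, m: int) -> int:
    """Growth term of the O(m log k) bound (constants ignored)."""
    return m * math.ceil(math.log2(k))


if __name__ == "__main__":
    for k, m in [(2, 3), (100, 6), (10_000, 4)]:
        print(
            f"k={k}, m={m}: "
            f"k^(m/2)={prior_bound_units(k, m):.0f}, "
            f"m*ceil(log2 k)={new_bound_units(k, m)}, "
            f"bits to separate all stacks={bits_to_distinguish(k, m)}"
        )
```

Because there are at least k^m distinct full-depth stacks, any encoding needs on the order of m log k bits, which is the intuition behind the lower bound in finding 02.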

Abstract

Recurrent neural networks empirically generate natural language with high syntactic fidelity. However, their success is not well-understood theoretically. We provide theoretical insight into this success, proving in a finite-precision setting that RNNs can efficiently generate bounded hierarchical languages that reflect the scaffolding of natural language syntax. We introduce Dyck-($k$,$m$), the language of well-nested brackets (of $k$ types) and $m$-bounded nesting depth, reflecting the bounded memory needs and long-distance dependencies of natural language syntax. The best known results use $O(k^{\frac{m}{2}})$ memory (hidden units) to generate these languages. We prove that an RNN with $O(m \log k)$ hidden units suffices, an exponential reduction in memory, by an explicit construction. Finally, we show that no algorithm, even with unbounded computation, can suffice with $o(m \log k)$ …
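The definition of Dyck-($k$,$m$) in the abstract can be illustrated with a small membership checker. This is a minimal sketch using an ordinary stack, not the RNN construction from the paper; the integer encoding of brackets (`+i` opens type `i`, `-i` closes it) and the name `in_dyck_km` are assumptions made for illustration.

```python
from typing import Iterable


def in_dyck_km(tokens: Iterable[int], k: int, m: int) -> bool:
    """Return True if `tokens` is a string of Dyck-(k, m).

    Assumed encoding: +i opens a bracket of type i, -i closes it, 1 <= i <= k.
    The string must be well nested and never exceed nesting depth m.
    """
    stack: list[int] = []
    for tok in tokens:
        kind = abs(tok)
        if not 1 <= kind <= k:
            return False  # unknown bracket type (this also rejects 0)
        if tok > 0:
            if len(stack) == m:
                return False  # opening here would exceed depth m
            stack.append(kind)
        else:
            if not stack or stack[-1] != kind:
                return False  # close without a matching open bracket
            stack.pop()
    return not stack  # every opened bracket must be closed


if __name__ == "__main__":
    print(in_dyck_km([1, 2, -2, -1], k=2, m=2))  # nested to depth 2
    print(in_dyck_km([1, 2, -2, -1], k=2, m=1))  # same string, depth bound 1
```

The stack never holds more than m entries, each one of k types, which is why a memory budget on the order of m log k bits is the natural target for the bounds stated above.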
