Context-Free Transductions with Neural Stacks

Yiding Hao, William Merrill, Dana Angluin, Robert Frank, Noah Amsel, Andrew Benz, and Simon Mendelsohn

arXiv:1809.02836 · cs.NE · September 11, 2018

TL;DR

This paper investigates stack-augmented RNNs, showing that they can learn intuitive stack-based strategies for formal language tasks. It also finds that they are harder to train than LSTMs, and that more complex networks often use the stack as unstructured memory instead.

Contribution

It analyzes the behavior of stack RNNs on formal language tasks, showing that they can discover stack-based strategies and comparing their training difficulty with that of classical architectures such as LSTMs.

Findings

01

Stack RNNs can learn stack-based strategies for tasks like string reversal.

02

Training stack RNNs is more difficult than training LSTMs.

03

More complex networks often find approximate solutions by using the stack as unstructured memory rather than employing stack-based strategies.
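As a point of reference for what a "stack-based strategy" for string reversal looks like, here is a minimal sketch of the classical, non-neural version: push every symbol while reading, then pop until the stack is empty. This is not the paper's model or its exact task format. The function name `reverse_with_stack` is hypothetical, and the split into a reading phase and a writing phase is an assumption made for illustration.

```python
def reverse_with_stack(symbols):
    """Reverse a sequence using only push and pop (illustrative, not the paper's model)."""
    stack = []

    # Reading phase: push each input symbol onto the stack.
    for symbol in symbols:
        stack.append(symbol)

    # Writing phase: pop until the stack is empty.
    # Last-in, first-out order yields the input in reverse.
    output = []
    while stack:
        output.append(stack.pop())
    return output


print("".join(reverse_with_stack("abcab")))  # prints "bacba"
```

A network that has learned this strategy would be expected to push during the input phase and pop during the output phase, rather than writing to the stack in some less structured way.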

Abstract

This paper analyzes the behavior of stack-augmented recurrent neural network (RNN) models. Due to the architectural similarity between stack RNNs and pushdown transducers, we train stack RNN models on a number of tasks, including string reversal, context-free language modelling, and cumulative XOR evaluation. Examining the behavior of our networks, we show that stack-augmented RNNs can discover intuitive stack-based strategies for solving our tasks. However, stack RNNs are more difficult to train than classical architectures such as LSTMs. Rather than employ stack-based strategies, more complex networks often find approximate solutions by using the stack as unstructured memory.
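To make the cumulative XOR task concrete, the sketch below computes the target sequence under one plausible reading: at each position, the output is the XOR of all input bits seen so far (the running parity). This interpretation and the function name `cumulative_xor` are assumptions for illustration. The abstract does not specify the exact input and output encoding used in the experiments.

```python
from itertools import accumulate
from operator import xor


def cumulative_xor(bits):
    """Return the running XOR (parity) of a bit sequence (assumed task definition)."""
    # accumulate applies xor pairwise from left to right:
    # out[0] = bits[0], out[i] = out[i - 1] ^ bits[i]
    return list(accumulate(bits, xor))


print(cumulative_xor([1, 0, 1, 1]))  # prints [1, 1, 0, 1]
```

Each output depends only on the previous output and the current bit, so the task needs a single bit of state rather than a stack.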

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis