On learning an interpreted language with recurrent models

Denis Paperno

arXiv:1809.04128·cs.CL·December 30, 2021

TL;DR

This paper investigates whether recurrent neural networks such as LSTM and GRU can learn to understand language, testing them on simplified datasets that mimic two core properties of natural language: recursive syntactic structure and compositionality.

Contribution

It shows that LSTM and GRU models can generalize to compositional interpretation, but only in the most favorable learning settings, which highlights the importance of a well-paced curriculum, extensive training data, and the direction of composition.

Findings

01 LSTM and GRU generalize well only with a well-paced curriculum

02 Generalization succeeds with left-to-right composition but not with right-to-left composition

03 Extensive training data is needed for generalization

Abstract

Can recurrent neural nets, inspired by human sequential data processing, learn to understand language? We construct simplified datasets reflecting core properties of natural language as modeled in formal syntax and semantics: recursive syntactic structure and compositionality. We find LSTM and GRU networks to generalise to compositional interpretation well, but only in the most favorable learning settings, with a well-paced curriculum, extensive training data, and left-to-right (but not right-to-left) composition.
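The contrast between left-to-right and right-to-left composition can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the four names, the two relation tables, and the helper functions `interpret`, `left_branching`, `right_branching`, and `make_example` are hypothetical and are not taken from the paper's dataset or code. The sketch only shows how one referent can be described by two recursive phrase shapes, one whose relations are read in the order they apply and one where they are read in reverse.

```python
import random

# Hypothetical toy universe (illustrative assumption, not the paper's data).
PEOPLE = ["ann", "bill", "carl", "dana"]
RELATIONS = {
    "friend": {"ann": "bill", "bill": "ann", "carl": "dana", "dana": "carl"},
    "enemy": {"ann": "carl", "carl": "ann", "bill": "dana", "dana": "bill"},
}


def interpret(name, relations):
    """Return the referent obtained by applying each relation in order."""
    referent = name
    for rel in relations:
        referent = RELATIONS[rel][referent]
    return referent


def left_branching(name, relations):
    """Render tokens like: ann 's friend 's enemy (relations in application order)."""
    tokens = [name]
    for rel in relations:
        tokens += ["'s", rel]
    return tokens


def right_branching(name, relations):
    """Render tokens like: the enemy of the friend of ann (relations reversed)."""
    tokens = []
    for rel in reversed(relations):
        tokens += ["the", rel, "of"]
    return tokens + [name]


def make_example(rng, depth, direction="left"):
    """Sample one (token sequence, referent) pair with `depth` nested relations."""
    name = rng.choice(PEOPLE)
    relations = [rng.choice(sorted(RELATIONS)) for _ in range(depth)]
    render = left_branching if direction == "left" else right_branching
    return render(name, relations), interpret(name, relations)


if __name__ == "__main__":
    rng = random.Random(0)
    for depth in (1, 2, 3):
        tokens, referent = make_example(rng, depth, "left")
        print(" ".join(tokens), "->", referent)
```

In the left-branching form a sequential reader can update a single running referent token by token, whereas in the right-branching form the name that anchors the whole phrase arrives last. A curriculum in this sketch would simply present examples in order of increasing `depth`; how the paper paces its own curriculum is not specified in the abstract.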

Code & Models

Repositories

maradf/wordmeaning (PyTorch)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory