On learning an interpreted language with recurrent models
Denis Paperno

TL;DR
This paper investigates whether recurrent neural networks such as LSTM and GRU can learn to understand language, testing them on simplified datasets that mimic core properties of natural language: recursive syntactic structure and compositionality.
Contribution
It demonstrates that LSTM and GRU models can generalize to compositional interpretation, but only in the most favorable training settings, highlighting the importance of curriculum, training data volume, and the direction of composition.
Findings
LSTM and GRU generalize well only with a well-paced curriculum
Generalization succeeds with left-to-right composition but not right-to-left
Extensive training data is needed for generalization
Abstract
Can recurrent neural nets, inspired by human sequential data processing, learn to understand language? We construct simplified datasets reflecting core properties of natural language as modeled in formal syntax and semantics: recursive syntactic structure and compositionality. We find LSTM and GRU networks to generalise to compositional interpretation well, but only in the most favorable learning settings, with a well-paced curriculum, extensive training data, and left-to-right (but not right-to-left) composition.
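To make the abstract's setup concrete, here is a minimal sketch of what an interpreted toy language with recursive structure and compositional meaning could look like. It is not the paper's dataset: the entity names, the relation names, the possessive and "of" surface forms, and every function below are hypothetical choices made for illustration. Each phrase denotes an entity, obtained by applying a chain of relations to a named individual. The two token orders are one plausible reading of the left-to-right versus right-to-left contrast, and `curriculum` orders examples by nesting depth.

```python
import random

# Hypothetical toy universe (illustrative only, not taken from the paper).
ENTITIES = ["ann", "bob", "cat", "dan"]
RELATIONS = ["friend", "enemy", "parent", "child"]


def make_world(seed=0):
    """Build a random model: each relation maps every entity to one entity."""
    rng = random.Random(seed)
    return {r: {e: rng.choice(ENTITIES) for e in ENTITIES} for r in RELATIONS}


def interpret(world, name, relations):
    """Compositional meaning: apply the relations to `name`, innermost first."""
    referent = name
    for r in relations:
        referent = world[r][referent]
    return referent


def left_branching(name, relations):
    """Tokens like: ann 's friend 's parent (the name is read first)."""
    tokens = [name]
    for r in relations:
        tokens += ["'s", r]
    return tokens


def right_branching(name, relations):
    """Tokens like: the parent of the friend of ann (the name is read last)."""
    tokens = []
    for r in reversed(relations):
        tokens += ["the", r, "of"]
    return tokens + [name]


def make_example(world, depth, rng, branching=left_branching):
    """Sample one (tokens, referent) pair with `depth` nested relations."""
    name = rng.choice(ENTITIES)
    relations = [rng.choice(RELATIONS) for _ in range(depth)]
    return branching(name, relations), interpret(world, name, relations)


def curriculum(world, max_depth, per_depth, seed=0, branching=left_branching):
    """Yield examples in order of increasing nesting depth (0 .. max_depth)."""
    rng = random.Random(seed)
    for depth in range(max_depth + 1):
        for _ in range(per_depth):
            yield make_example(world, depth, rng, branching)
```

In the left-branching order, a sequence model reading left to right can in principle keep a single running referent and update it at each relation token. In the right-branching order, the named individual arrives last, so the relations must be held until the end. A recurrent model would be trained to map each token sequence to its referent. Holding out deeper phrases for testing is one way to probe compositional generalization.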
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
