On the long-term learning ability of LSTM LMs

Wim Boes; Robbe Van Rompaey; Lyan Verwimp; Joris Pelemans; Hugo Van hamme; Patrick Wambacq

arXiv:2106.08927·cs.CL·June 17, 2021

TL;DR

This paper investigates the long-term learning capabilities of LSTM language models by evaluating a contextual extension and analyzing its impact on sentence- and discourse-level models across text and speech.

Contribution

It evaluates a contextual extension based on the Continuous Bag-of-Words (CBOW) model for LSTM LMs at both the sentence and discourse level, showing how much each kind of model relies on long-range contextual information.

Findings

01

Sentence-level models with the extension perform comparably to vanilla discourse-level LSTM LMs.

02

The extension does not improve discourse-level models.

03

Discourse-level LSTM LMs already rely on contextual information to perform long-term learning.

Abstract

We inspect the long-term learning ability of Long Short-Term Memory language models (LSTM LMs) by evaluating a contextual extension based on the Continuous Bag-of-Words (CBOW) model for both sentence- and discourse-level LSTM LMs and by analyzing its performance. We evaluate on text and speech. Sentence-level models using the long-term contextual module perform comparably to vanilla discourse-level LSTM LMs. On the other hand, the extension does not provide gains for discourse-level models. These findings indicate that discourse-level LSTM LMs already rely on contextual information to perform long-term learning.
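The idea of a CBOW-based contextual extension can be illustrated with a minimal sketch. The abstract does not specify how the context vector is built or how it is fed to the LSTM, so everything below is an assumption made for illustration: the context is taken to be the plain average of the embeddings of earlier words, and it is appended to each word embedding of the current sentence before that input would reach the LSTM. The names `cbow_context` and `extend_inputs` are hypothetical and do not come from the paper.

```python
# Illustrative sketch only; not the authors' implementation.
# Assumption: the long-term context is an unweighted average of the
# embeddings of previously seen words (CBOW-style), and it is
# concatenated to every input embedding of the current sentence.

def cbow_context(embeddings, history_ids):
    """Average the embedding vectors of the words in the history."""
    dim = len(embeddings[0])
    if not history_ids:
        # No history yet (e.g. first sentence): fall back to a zero vector.
        return [0.0] * dim
    return [
        sum(embeddings[i][k] for i in history_ids) / len(history_ids)
        for k in range(dim)
    ]


def extend_inputs(embeddings, sentence_ids, history_ids):
    """Append the context vector to each word embedding of the sentence."""
    context = cbow_context(embeddings, history_ids)
    return [embeddings[i] + context for i in sentence_ids]


# Toy embedding table: 4 words, 2 dimensions each (hypothetical values).
embeddings = [[0.0, 0.0], [1.0, 3.0], [3.0, 5.0], [2.0, 2.0]]
history = [1, 2]    # word ids from earlier sentences
sentence = [3, 0]   # word ids of the current sentence

inputs = extend_inputs(embeddings, sentence, history)
print(inputs)
```

Under these assumptions, a sentence-level model would receive a summary of earlier sentences through the extra input dimensions, whereas a discourse-level model already carries that history in its recurrent state, which is consistent with the abstract's report that the extension brings no gains in that setting.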

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis

Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory