Empirical Evaluation of A New Approach to Simplifying Long Short-term Memory (LSTM)

Yuzhen Lu

arXiv:1612.03707·cs.NE·December 13, 2016·1 cites

Open Access

TL;DR

This paper empirically compares the standard LSTM with three simplified variants that reduce parameters, demonstrating comparable performance on sequence modeling tasks with adjustments to learning rates.

Contribution

The paper introduces three simplified LSTM variants, obtained by removing the input signal, bias, and hidden unit signal from individual gates, and shows empirically that they perform comparably to the standard model on two sequence datasets.

Findings

01 Simplified LSTM variants achieve comparable accuracy to standard LSTM.

02 Reduced parameter models require tuning of learning rates.

03 Simplifications can reduce complexity without sacrificing performance.

Abstract

The standard LSTM, although it succeeds in modeling long-range dependencies, suffers from a highly complex structure that can be simplified through modifications to its gate units. This paper performs an empirical comparison between the standard LSTM and three new simplified variants, obtained by eliminating the input signal, bias, and hidden unit signal from individual gates, on the task of modeling two sequence datasets. The experiments show that the three variants, with fewer parameters, can achieve performance comparable to that of the standard LSTM. Due attention should be paid to tuning the learning rate to achieve high accuracy.
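To make the parameter reduction concrete, the following is a minimal sketch of how dropping terms from the gate pre-activations changes the parameter count of one LSTM layer. It assumes the usual gate form sigmoid(W x + U h + b) for each of the three gates and an unmodified cell-candidate transform. The function name `lstm_param_count`, its flags, and the example sizes are hypothetical, and the abstract does not state which combination of removed terms defines each of the three variants, so the configurations shown are illustrative only.

```python
def lstm_param_count(n_input, n_hidden,
                     gate_input=True, gate_hidden=True, gate_bias=True):
    """Count trainable parameters in one LSTM layer (three gates plus cell candidate).

    Assumption: the cell-candidate transform keeps its full form W x + U h + b;
    only the three gates (input, forget, output) are simplified.
    """
    # Cell candidate: input weights + recurrent weights + bias.
    candidate = n_hidden * n_input + n_hidden * n_hidden + n_hidden

    # Each gate keeps only the terms that are switched on.
    per_gate = 0
    if gate_input:
        per_gate += n_hidden * n_input   # W x term
    if gate_hidden:
        per_gate += n_hidden * n_hidden  # U h term
    if gate_bias:
        per_gate += n_hidden             # b term

    return candidate + 3 * per_gate


if __name__ == "__main__":
    n_in, n_h = 28, 100  # hypothetical layer sizes, not taken from the paper
    configs = {
        "standard gates": dict(),
        "gates without input signal": dict(gate_input=False),
        "gates without input signal or bias": dict(gate_input=False, gate_bias=False),
        "gates with bias only": dict(gate_input=False, gate_hidden=False),
    }
    for name, flags in configs.items():
        print(f"{name}: {lstm_param_count(n_in, n_h, **flags)} parameters")
```

Under these assumptions the savings depend on which term is removed: the recurrent term dominates when the hidden size exceeds the input size, while the bias contributes only one parameter per hidden unit per gate.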

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Topic Modeling

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory