A comparison of LSTM and GRU networks for learning symbolic sequences

Roberto Cahuantzi; Xinye Chen; Stefan Güttel

arXiv:2107.02248 · cs.LG · November 17, 2023 · 27 cites

TL;DR

This study compares LSTM and GRU recurrent neural networks in their ability to memorize symbolic sequences of varying complexity, highlighting the impact of hyper-parameters and network depth on learning performance.

Contribution

The paper provides a systematic comparison of LSTM and GRU architectures for symbolic sequence learning, emphasizing the effects of network depth and hyper-parameters.

Findings

01 GRUs outperform LSTMs on low-complexity sequences

02 LSTMs perform better on high-complexity sequences

03 Increasing network depth does not always improve memorization under limited training time
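When weighing depth against units per layer, it can help to know how many trainable parameters each configuration carries. The sketch below counts parameters for stacked LSTM and GRU layers in the textbook formulation (four gate blocks for an LSTM, three for a GRU, one bias vector per block). The function name `rnn_param_count` and the example sizes are illustrative assumptions and are not taken from the paper; frameworks such as PyTorch keep two bias vectors per block, so their reported counts are slightly higher.

```python
def rnn_param_count(cell: str, input_size: int, hidden_size: int, num_layers: int) -> int:
    """Count trainable parameters of a stacked LSTM or GRU (textbook form).

    Each gate block has an input weight matrix, a recurrent weight matrix,
    and a single bias vector. This is an assumption; some libraries use
    two bias vectors per block.
    """
    gate_blocks = {"lstm": 4, "gru": 3}[cell]
    total = 0
    layer_input = input_size
    for _ in range(num_layers):
        per_block = hidden_size * (layer_input + hidden_size) + hidden_size
        total += gate_blocks * per_block
        # Every layer after the first reads the previous layer's hidden state.
        layer_input = hidden_size
    return total


if __name__ == "__main__":
    # Hypothetical sizes chosen only for illustration.
    print(rnn_param_count("lstm", 10, 20, 1))  # 2480
    print(rnn_param_count("gru", 10, 20, 1))   # 1860
    print(rnn_param_count("lstm", 10, 20, 2))  # 5760
```

At equal width and depth a GRU has three quarters of the parameters of an LSTM under this convention, which is worth keeping in mind when the two are compared under a fixed training budget.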

Abstract

We explore the architecture of recurrent neural networks (RNNs) by studying the complexity of string sequences they are able to memorize. Symbolic sequences of different complexity are generated to simulate RNN training and study parameter configurations with a view to the network's capability of learning and inference. We compare Long Short-Term Memory (LSTM) networks and gated recurrent units (GRUs). We find that an increase in RNN depth does not necessarily result in better memorization capability when the training time is constrained. Our results also indicate that the learning rate and the number of units per layer are among the most important hyper-parameters to be tuned. Generally, GRUs outperform LSTM networks on low-complexity sequences while on high-complexity sequences LSTMs perform better.
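To make "symbolic sequences of different complexity" concrete, here is a minimal sketch that builds two strings and scores them. The abstract above does not say how complexity is measured; as an assumption, the sketch uses the number of codes an LZW encoder would emit as a rough proxy. The helper names (`lzw_code_count`, `periodic_string`, `random_string`) and the alphabet and lengths are hypothetical choices for illustration only.

```python
import random


def lzw_code_count(s: str) -> int:
    """Number of codes an LZW encoder emits for s (a rough complexity proxy)."""
    if not s:
        return 0
    dictionary = set(s)  # start with every single symbol that occurs in s
    codes = 0
    phrase = ""
    for ch in s:
        if phrase + ch in dictionary:
            phrase += ch  # keep extending the current known phrase
        else:
            codes += 1  # emit the code for the current phrase
            dictionary.add(phrase + ch)
            phrase = ch
    return codes + 1  # emit the final pending phrase


def periodic_string(seed: str, length: int) -> str:
    """Repeat a short seed pattern up to the requested length."""
    repeats = length // len(seed) + 1
    return (seed * repeats)[:length]


def random_string(alphabet: str, length: int, rng: random.Random) -> str:
    """Draw symbols independently and uniformly from the alphabet."""
    return "".join(rng.choice(alphabet) for _ in range(length))


if __name__ == "__main__":
    rng = random.Random(0)
    low = periodic_string("ab", 200)
    high = random_string("abcd", 200, rng)
    print("periodic:", lzw_code_count(low))
    print("random:  ", lzw_code_count(high))
```

A strictly periodic string compresses into few, ever longer phrases, while a random string over a larger alphabet needs many more codes, so the two land at opposite ends of this proxy scale.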

Code & Models

Repositories

nla-group/slearn
pytorch · Official

Taxonomy

Topics: Neural Networks and Applications · Computational Physics and Python Applications · Model Reduction and Neural Networks

Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory