Short-Term Memory Optimization in Recurrent Neural Networks by Autoencoder-based Initialization

Antonio Carta; Alessandro Sperduti; Davide Bacciu

arXiv:2011.02886·cs.LG·November 6, 2020

TL;DR

This paper proposes a novel autoencoder-based initialization method for RNNs that enhances short-term memory and gradient flow, improving performance on long sequence tasks without relying on backpropagation for pretraining.

Contribution

It introduces a linear autoencoder pretraining schema for RNN initialization, enabling better handling of long sequences and gradient propagation during training.

Findings

01 Lower reconstruction error on long sequences

02 Improved gradient propagation during finetuning

03 Enhanced performance on sequential and permuted MNIST

Abstract

Training RNNs to learn long-term dependencies is difficult due to vanishing gradients. We explore an alternative solution based on explicit memorization using linear autoencoders for sequences, which maximize the short-term memory and can be computed in closed form without backpropagation. We introduce an initialization schema that pretrains the weights of a recurrent neural network to approximate the linear autoencoder of the input sequences, and we show how such pretraining can better support solving hard classification tasks with long sequences. We test our approach on sequential and permuted MNIST. We show that the proposed approach achieves a much lower reconstruction error for long sequences and better gradient propagation during the finetuning phase.
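To make the closed-form idea concrete, here is a minimal sketch of a linear autoencoder for a single sequence, built with an SVD of the matrix of reversed input histories. It is an illustration under stated assumptions, not the code from the paper's repository: the function names (`fit_linear_autoencoder`, `encode`, `decode`), the use of NumPy, the single-sequence setting, and the toy data are all hypothetical choices made here. The hidden state follows the linear recurrence y_t = A x_t + B y_{t-1}, and the transposed matrices are used to unroll the state back into the inputs.

```python
import numpy as np


def fit_linear_autoencoder(seq, hidden_size):
    """Closed-form linear autoencoder for one sequence (illustrative sketch).

    seq: array of shape (T, a). Returns A with shape (p, a) and B with
    shape (p, p), where p = hidden_size.
    """
    T, a = seq.shape
    # Row t holds the reversed history [x_t, x_{t-1}, ..., x_0, 0, ..., 0].
    xi = np.zeros((T, T * a))
    for t in range(T):
        hist = seq[t::-1].reshape(-1)
        xi[t, :hist.size] = hist
    # The leading right singular vectors span the space of histories.
    _, _, vt = np.linalg.svd(xi, full_matrices=False)
    U = vt[:hidden_size].T                # shape (T * a, p)
    P = np.zeros((T * a, a))
    P[:a] = np.eye(a)                     # writes x_t into the first block
    R = np.eye(T * a, k=-a)               # shifts a history down by one step
    A = U.T @ P
    B = U.T @ R @ U
    return A, B


def encode(seq, A, B):
    """Run the linear recurrence y_t = A x_t + B y_{t-1} from y = 0."""
    y = np.zeros(B.shape[0])
    for x in seq:
        y = A @ x + B @ y
    return y


def decode(y, A, B, steps):
    """Unroll the state backwards: x_t = A^T y_t, y_{t-1} = B^T y_t."""
    out = []
    for _ in range(steps):
        out.append(A.T @ y)
        y = B.T @ y
    return np.array(out[::-1])            # restore chronological order


# Toy data (an assumption for the demo): 6 steps of 2-dimensional input.
rng = np.random.default_rng(0)
seq = rng.standard_normal((6, 2))
A, B = fit_linear_autoencoder(seq, hidden_size=6)
rec = decode(encode(seq, A, B), A, B, steps=len(seq))
print("max reconstruction error:", np.abs(rec - seq).max())
```

In this sketch, reconstruction is exact up to floating-point error only when `hidden_size` equals the rank of the history matrix; a smaller `hidden_size` keeps just the leading singular directions and gives an approximate reconstruction. No gradient step is involved at any point, which is the property the abstract relies on for pretraining.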

Code & Models

Repositories

AntonioCarta/rnn_autoencoding_neurips2020
PyTorch · Official

Taxonomy

Topics: Neural Networks and Reservoir Computing · Domain Adaptation and Few-Shot Learning · Neural Networks and Applications