Short-Term Memory Optimization in Recurrent Neural Networks by Autoencoder-based Initialization
Antonio Carta, Alessandro Sperduti, Davide Bacciu

TL;DR
This paper proposes an autoencoder-based initialization method for RNNs that enhances short-term memory and gradient flow, improving performance on long-sequence tasks. The pretraining step is solved in closed form and does not rely on backpropagation.
Contribution
The paper introduces a pretraining scheme that initializes the weights of an RNN to approximate a linear autoencoder of the input sequences, which improves the handling of long sequences and gradient propagation during finetuning.
Findings
Lower reconstruction error on long sequences
Improved gradient propagation during finetuning
Enhanced performance on sequential and permuted MNIST
Abstract
Training RNNs to learn long-term dependencies is difficult due to vanishing gradients. We explore an alternative solution based on explicit memorization using linear autoencoders for sequences, which maximize short-term memory and admit a closed-form solution that requires no backpropagation. We introduce an initialization scheme that pretrains the weights of a recurrent neural network to approximate the linear autoencoder of the input sequences, and we show how such pretraining can better support solving hard classification tasks with long sequences. We test our approach on sequential and permuted MNIST. We show that the proposed approach achieves a much lower reconstruction error for long sequences and better gradient propagation during the finetuning phase.
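Example (illustrative sketch)
The abstract mentions a closed-form solution for the linear autoencoder but this page does not give the equations. The sketch below assumes the common formulation in which the encoder is h_t = A x_t + B h_{t-1} and the decoder recovers x_t and h_{t-1} from h_t using the transposes of A and B, with both matrices obtained from an SVD of the matrix of reversed sequence prefixes. The function names, the single-sequence setup, and the NumPy implementation are assumptions made for illustration, not the authors' code.

```python
import numpy as np

def fit_linear_autoencoder(x, hidden_size):
    """Closed-form fit on one sequence x of shape (T, a). Returns A (p, a), B (p, p).

    Assumed formulation: h_t = A x_t + B h_{t-1}.
    """
    T, a = x.shape
    # Row t holds the reversed prefix [x_t, x_{t-1}, ..., x_1, 0, ..., 0].
    xi = np.zeros((T, T * a))
    for t in range(T):
        xi[t, :(t + 1) * a] = x[t::-1].reshape(-1)
    # Truncated SVD: the columns of u span the top right-singular directions.
    _, _, vt = np.linalg.svd(xi, full_matrices=False)
    u = vt[:hidden_size].T                      # shape (T * a, p)
    # A = U^T P, where P places x_t in the first a coordinates.
    A = u[:a].T
    # B = U^T R U, where R shifts a vector down by a coordinates.
    shifted_u = np.zeros_like(u)
    shifted_u[a:] = u[:-a]
    B = u.T @ shifted_u
    return A, B

def encode(x, A, B):
    """Run the linear recurrence and return the final state."""
    h = np.zeros(A.shape[0])
    for x_t in x:
        h = A @ x_t + B @ h
    return h

def decode(h, A, B, steps):
    """Unroll the state backwards: x_t = A^T h_t, h_{t-1} = B^T h_t."""
    outputs = []
    for _ in range(steps):
        outputs.append(A.T @ h)
        h = B.T @ h
    return np.array(outputs[::-1])              # restore chronological order
```

When hidden_size equals the rank of the prefix matrix, the final state should reconstruct the whole sequence up to numerical error; smaller values give a lossy memory. In an initialization scheme like the one the abstract describes, A and B would presumably seed the input-to-hidden and hidden-to-hidden weights of the RNN before finetuning, but that mapping is an assumption here.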
Taxonomy
Topics: Neural Networks and Reservoir Computing · Domain Adaptation and Few-Shot Learning · Neural Networks and Applications
