Incremental Training of a Recurrent Neural Network Exploiting a Multi-Scale Dynamic Memory

Antonio Carta; Alessandro Sperduti; Davide Bacciu

arXiv:2006.16800·cs.LG·July 1, 2020

TL;DR

This paper introduces a novel incremental training method for recurrent neural networks that employs a multi-scale dynamic memory architecture, enhancing their ability to learn long-term dependencies in sequences.

Contribution

It proposes a modular RNN architecture with separate frequency-based modules and an incremental training algorithm to improve long-term sequence learning.

Findings

01. Enhanced long-term dependency capture in RNNs
02. Improved performance on speech recognition tasks
03. Effective multi-scale memory utilization

Abstract

The effectiveness of recurrent neural networks can be largely influenced by their ability to store into their dynamical memory information extracted from input sequences at different frequencies and timescales. Such a feature can be introduced into a neural architecture by an appropriate modularization of the dynamic memory. In this paper we propose a novel incrementally trained recurrent architecture targeting explicitly multi-scale learning. First, we show how to extend the architecture of a simple RNN by separating its hidden state into different modules, each subsampling the network hidden activations at different frequencies. Then, we discuss a training algorithm where new modules are iteratively added to the model to learn progressively longer dependencies. Each new module works at a slower frequency than the previous ones and it is initialized to encode the subsampled sequence of…
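The modular idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the class name `MultiScaleMemory`, the update period of `2**k` for module `k`, the `tanh` update rule, and the random initialization of new modules are all assumptions made for illustration. In particular, the abstract describes a dedicated initialization for each new module, which is not reproduced here.

```python
import numpy as np


class MultiScaleMemory:
    """Toy modular memory (hypothetical): module k updates every 2**k steps."""

    def __init__(self, input_size, hidden_size, module_size, seed=0):
        self.hidden_size = hidden_size
        self.module_size = module_size
        self.rng = np.random.default_rng(seed)
        # Input-to-hidden weights shared by all modules.
        self.w_xh = 0.1 * self.rng.standard_normal((hidden_size, input_size))
        self.w_hm = []  # hidden-to-module weights, one matrix per module
        self.w_mm = []  # module-to-module recurrent weights, one per module
        self.add_module()  # start with the fastest module (period 1)

    @property
    def num_modules(self):
        return len(self.w_hm)

    def add_module(self):
        """Append a slower module. Random init is a placeholder assumption."""
        shape_in = (self.module_size, self.hidden_size)
        shape_rec = (self.module_size, self.module_size)
        self.w_hm.append(0.1 * self.rng.standard_normal(shape_in))
        self.w_mm.append(0.1 * self.rng.standard_normal(shape_rec))

    def run(self, xs):
        """Process xs of shape (T, input_size).

        Returns the concatenated final module states and the number of
        updates each module performed.
        """
        states = [np.zeros(self.module_size) for _ in range(self.num_modules)]
        updates = [0] * self.num_modules
        for t, x in enumerate(xs):
            h = np.tanh(self.w_xh @ x)  # hidden activation at step t
            for k in range(self.num_modules):
                if t % (2 ** k) == 0:  # module k subsamples with period 2**k
                    states[k] = np.tanh(self.w_hm[k] @ h + self.w_mm[k] @ states[k])
                    updates[k] += 1
        return np.concatenate(states), updates


# Grow the model one module at a time. In incremental training, the model
# would be trained between additions; that step is omitted in this sketch.
model = MultiScaleMemory(input_size=3, hidden_size=5, module_size=4)
for _ in range(2):
    model.add_module()

xs = np.ones((8, 3))
state, updates = model.run(xs)
print(state.shape, updates)  # (12,) [8, 4, 2]
```

On a sequence of 8 steps, the three modules update 8, 4, and 2 times respectively, so each added module sees a coarser view of the hidden activations than the one before it.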

Code & Models

Repositories

AntonioCarta/mslmn (PyTorch, official)
