Dynamic Analysis and an Eigen Initializer for Recurrent Neural Networks

Ran Dou; Jose Principe

arXiv:2307.15679·cs.LG·July 31, 2023

TL;DR

This paper analyzes the dynamics of hidden states in recurrent neural networks using eigen decomposition, offering insights into long-term dependency and proposing a new initialization method that improves performance across various tasks.

Contribution

It introduces a novel eigen-based analysis of RNN hidden states and a new initialization method applicable to multiple RNN architectures, enhancing long-term dependency learning.

Findings

01. Eigen analysis of the weight matrix explains long-term dependency in RNNs.

02. The proposed initializer outperforms the Xavier, Kaiming, IRNN, and sp-RNN initializers.

03. Performance improves across diverse tasks, including MNIST and machine translation.
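To give the baseline comparison some context, the sketch below measures the spectral radius (largest eigenvalue magnitude) of a square recurrent matrix under three of the named baselines at initialization. It assumes the standard definitions: Xavier uniform with bound sqrt(6 / (fan_in + fan_out)), Kaiming normal with standard deviation sqrt(2 / fan_in), and IRNN as the identity matrix. The function names and the hidden size of 256 are illustrative choices. sp-RNN and the proposed eigen initializer are left out because this page does not specify them.

```python
import numpy as np

def spectral_radius(W):
    """Largest eigenvalue magnitude of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(W))))

def xavier_uniform(n, rng):
    # Standard Xavier/Glorot uniform bound for an n x n matrix (fan_in = fan_out = n).
    bound = np.sqrt(6.0 / (n + n))
    return rng.uniform(-bound, bound, size=(n, n))

def kaiming_normal(n, rng):
    # Standard Kaiming/He normal for ReLU: std = sqrt(2 / fan_in).
    return rng.standard_normal((n, n)) * np.sqrt(2.0 / n)

def irnn_identity(n):
    # IRNN initializes the recurrent matrix to the identity.
    return np.eye(n)

rng = np.random.default_rng(0)
n = 256  # hidden size, chosen only for illustration

radii = {
    "xavier": spectral_radius(xavier_uniform(n, rng)),
    "kaiming": spectral_radius(kaiming_normal(n, rng)),
    "irnn": spectral_radius(irnn_identity(n)),
}

for name, radius in radii.items():
    print(f"{name}: spectral radius = {radius:.3f}")
```

For a random matrix with independent zero-mean entries of variance s², the spectral radius is roughly s·sqrt(n), so the Xavier matrix lands near 1 and the Kaiming matrix near sqrt(2), while the identity sits exactly at 1. This only describes the starting point of training, not the results reported in the paper.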

Abstract

In recurrent neural networks, learning long-term dependencies is the main difficulty because of the vanishing and exploding gradient problem. Many researchers have worked on this issue and proposed many algorithms. Although these algorithms have achieved great success, understanding how information decays remains an open problem. In this paper, we study the dynamics of the hidden state in recurrent neural networks. We propose a new perspective for analyzing the hidden state space based on an eigen decomposition of the weight matrix. We start the analysis with a linear state space model and explain the role of activation functions in preserving information. We provide an explanation for long-term dependency based on the eigen analysis. We also point out that eigenvalues behave differently for regression tasks and classification tasks. From the observations on well-trained…
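The linear state space starting point can be illustrated with a small numerical sketch. It assumes a linear recurrence h_t = W h_{t-1} with no input and no activation, and it builds W from hand-picked eigenvalues with orthogonal eigenvectors so the decomposition is well conditioned. The function names, the eigenvalues (1.0, 0.9, 0.5), and the 50-step horizon are illustrative assumptions, not values from the paper.

```python
import numpy as np

def build_linear_rnn(eigenvalues, seed=0):
    """Return W = Q diag(eigenvalues) Q^T with a random orthogonal Q."""
    rng = np.random.default_rng(seed)
    n = len(eigenvalues)
    # QR of a random Gaussian matrix yields an orthogonal eigenvector basis.
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    W = Q @ np.diag(eigenvalues) @ Q.T
    return W, Q

def mode_coefficients(W, Q, h0, steps):
    """Iterate h_t = W h_{t-1} and record h_t expressed in the eigenbasis."""
    h = h0.copy()
    history = [Q.T @ h]
    for _ in range(steps):
        h = W @ h
        history.append(Q.T @ h)
    return np.array(history)

eigenvalues = np.array([1.0, 0.9, 0.5])
steps = 50

W, Q = build_linear_rnn(eigenvalues)
h0 = Q @ np.ones(len(eigenvalues))  # equal weight on every eigen-direction
coeffs = mode_coefficients(W, Q, h0, steps)

# After t steps, the coefficient along eigenvector i equals eigenvalue_i ** t.
print("coefficients after", steps, "steps:", coeffs[-1])
print("predicted by eigenvalues:     ", eigenvalues ** steps)
```

In this toy setting, the component along the eigenvalue 1.0 is preserved indefinitely, while the components along 0.9 and 0.5 shrink geometrically. That is the sense in which the eigenvalues of the weight matrix govern how long information in the hidden state survives.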

Taxonomy

Topics: Neural Networks and Applications · Advanced Memory and Neural Computing · Machine Learning and ELM

Methods: Sigmoid Activation · Gated Recurrent Unit · Tanh Activation · Long Short-Term Memory