State Gradients for RNN Memory Analysis
Lyan Verwimp, Hugo Van hamme, Vincent Renkens, Patrick Wambacq

TL;DR
This paper introduces a gradient-based framework for analyzing what RNN states, specifically those of LSTM language models, retain from their input embeddings, and for how long particular classes of words and properties are remembered.
Contribution
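The authors compute the gradients of the RNN states with respect to the input embeddings and decompose the resulting gradient matrix with Singular Value Decomposition (SVD), which quantifies which directions in embedding space are best transferred to the hidden state space over time.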
Findings
Identifies embedding directions best preserved in RNN states
Tracks memory of specific words and properties over sequences
Applies the analysis to LSTM language models to measure to what extent and for how long classes of words are remembered on average over a corpus
Abstract
We present a framework for analyzing what the state in RNNs remembers from its input embeddings. Our approach is inspired by backpropagation, in the sense that we compute the gradients of the states with respect to the input embeddings. The gradient matrix is decomposed with Singular Value Decomposition to analyze which directions in the embedding space are best transferred to the hidden state space, characterized by the largest singular values. We apply our approach to LSTM language models and investigate to what extent and for how long certain classes of words are remembered on average for a certain corpus. Additionally, the extent to which a specific property or relationship is remembered by the RNN can be tracked by comparing a vector characterizing that property with the direction(s) in embedding space that are best preserved in hidden state space.
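The procedure in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: to keep the Jacobian analytic and the example self-contained, it swaps in a plain tanh RNN for the LSTM, uses random weights and inputs, and the names `run_rnn`, `state_gradient`, and `property_vec` are hypothetical. It computes the gradient of a later state with respect to an earlier input embedding, decomposes it with SVD, and compares a property vector with the best-preserved embedding direction.

```python
import numpy as np

def run_rnn(W, U, b, xs):
    """Run a plain tanh RNN (an assumption; the paper studies LSTMs)."""
    h = np.zeros(U.shape[0])
    hs = []
    for x in xs:
        h = np.tanh(W @ x + U @ h + b)
        hs.append(h)
    return hs

def state_gradient(W, U, hs, t, T):
    """Jacobian d h_T / d x_t, shape (hidden_dim, embed_dim), with T >= t."""
    # Direct effect of x_t on h_t: diag(1 - h_t^2) W
    J = (1.0 - hs[t] ** 2)[:, None] * W
    # Propagate through the recurrent steps t+1 .. T: diag(1 - h_k^2) U
    for k in range(t + 1, T + 1):
        J = ((1.0 - hs[k] ** 2)[:, None] * U) @ J
    return J

# Toy setup with random parameters and inputs (all values are illustrative).
rng = np.random.default_rng(0)
embed_dim, hidden_dim, seq_len = 4, 5, 6
W = rng.normal(scale=0.5, size=(hidden_dim, embed_dim))
U = rng.normal(scale=0.5, size=(hidden_dim, hidden_dim))
b = np.zeros(hidden_dim)
xs = rng.normal(size=(seq_len, embed_dim))
hs = run_rnn(W, U, b, xs)

# Gradient of the state at step T with respect to the embedding at step t.
t, T = 1, 4
J = state_gradient(W, U, hs, t, T)

# SVD: rows of right_t are embedding-space directions, ordered by how
# strongly they are transferred to the state (largest singular value first).
left, sing, right_t = np.linalg.svd(J)
best_dir = right_t[0]

# Hypothetical vector characterizing some property in embedding space.
property_vec = rng.normal(size=embed_dim)
property_vec /= np.linalg.norm(property_vec)

# Absolute cosine between the property and the best-preserved direction.
alignment = abs(property_vec @ best_dir)
print("singular values:", sing)
print("alignment with best-preserved direction:", alignment)
```

Repeating the computation for growing `T - t` shows how quickly the singular values shrink, which is one way to read "for how long" an input is remembered. With a trained LSTM, the Jacobian would typically come from automatic differentiation rather than the closed form used here.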
Taxonomy
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory
