Implicit Bias of Linear RNNs

Melikasadat Emami; Mojtaba Sahraee-Ardakan; Parthe Pandit; Sundeep Rangan; Alyson K. Fletcher

arXiv:2101.07833·cs.LG·January 21, 2021·1 cites

Open Access · 1 Video

TL;DR

This paper rigorously analyzes linear RNNs trained from random initialization in the kernel regime. It shows that they carry an implicit bias toward shorter memory, which helps explain why they perform poorly on long-term memory tasks.

Contribution

It provides a novel theoretical explanation for the implicit bias in linear RNNs towards short-term memory, using kernel regime analysis.

Findings

01. Linear RNNs are equivalent to weighted 1D convolutional networks.

02. Bias favors elements with smaller time lags, leading to shorter memory.

03. The bias magnitude relates to the variance of the transition kernel at initialization.
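The convolutional view in the first finding can be checked numerically. The sketch below is an illustration, not the paper's code: it assumes the standard linear state-space form h_t = W h_{t-1} + F x_t, y_t = C h_t with zero initial state, and the names `W`, `F`, `C`, `linear_rnn`, `impulse_response`, and `causal_conv` are chosen here for the example. Under that assumption the output equals a causal convolution of the input with the kernel L_j = C W^j F.

```python
import numpy as np

def linear_rnn(W, F, C, x):
    """Run h_t = W h_{t-1} + F x_t, y_t = C h_t, starting from h = 0."""
    h = np.zeros(W.shape[0])
    ys = []
    for x_t in x:
        h = W @ h + F @ x_t
        ys.append(C @ h)
    return np.array(ys)

def impulse_response(W, F, C, T):
    """Return the kernels L_j = C W^j F for lags j = 0..T-1."""
    kernels = []
    M = F.copy()  # M holds W^j F for the current lag j
    for _ in range(T):
        kernels.append(C @ M)
        M = W @ M
    return np.array(kernels)  # shape (T, n_out, n_in)

def causal_conv(L, x):
    """Compute y_t = sum_{j=0}^{t} L_j x_{t-j} for every time step t."""
    T = len(x)
    return np.array([sum(L[j] @ x[t - j] for j in range(t + 1)) for t in range(T)])

# Hypothetical sizes and random parameters, chosen only for illustration.
rng = np.random.default_rng(0)
n, n_in, n_out, T = 8, 2, 3, 10
W = rng.normal(scale=0.5 / np.sqrt(n), size=(n, n))
F = rng.normal(size=(n, n_in))
C = rng.normal(size=(n_out, n))
x = rng.normal(size=(T, n_in))

y_rnn = linear_rnn(W, F, C, x)
y_conv = causal_conv(impulse_response(W, F, C, T), x)
print(np.allclose(y_rnn, y_conv))
```

The two outputs agree because unrolling the recurrence gives h_t as the sum of W^j F x_{t-j} over all lags j up to t. The RNN is therefore linear in the input but non-linear in its own parameters, since W enters through its powers.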

Abstract

Contemporary wisdom based on empirical studies suggests that standard recurrent neural networks (RNNs) do not perform well on tasks requiring long-term memory. However, precise reasoning for this behavior is still unknown. This paper provides a rigorous explanation of this property in the special case of linear RNNs. Although this work is limited to linear RNNs, even these systems have traditionally been difficult to analyze due to their non-linear parameterization. Using recently-developed kernel regime analysis, our main result shows that linear RNNs learned from random initializations are functionally equivalent to a certain weighted 1D-convolutional network. Importantly, the weightings in the equivalent model cause an implicit bias to elements with smaller time lags in the convolution and hence, shorter memory. The degree of this bias depends on the variance of the transition kernel…
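The dependence on lag and on initialization variance can be illustrated at random initialization alone. The following is a rough sketch under assumptions not stated in the abstract: the transition matrix has i.i.d. Gaussian entries with variance rho^2 / n, the input and output maps are Gaussian vectors, and `lag_energy` and `rho` are hypothetical names introduced here. It estimates the average squared kernel coefficient (c^T W^j f)^2 at each lag j. It shows only how the initial kernel decays with lag, and it does not reproduce the paper's kernel regime analysis.

```python
import numpy as np

def lag_energy(n, rho, max_lag, trials, seed=0):
    """Average (c^T W^j f)^2 over random draws, for lags j = 0..max_lag-1."""
    rng = np.random.default_rng(seed)
    energy = np.zeros(max_lag)
    for _ in range(trials):
        # Assumed initialization: i.i.d. Gaussian entries with variance rho^2 / n.
        W = rng.normal(scale=rho / np.sqrt(n), size=(n, n))
        f = rng.normal(size=n) / np.sqrt(n)
        c = rng.normal(size=n) / np.sqrt(n)
        v = f.copy()  # v holds W^j f for the current lag j
        for j in range(max_lag):
            energy[j] += (c @ v) ** 2
            v = W @ v
    return energy / trials

for rho in (0.5, 0.9):
    print(rho, lag_energy(n=100, rho=rho, max_lag=6, trials=50))
```

Under these assumptions, a smaller `rho` should make the coefficients at larger lags shrink faster. This matches the abstract's statement that the degree of the bias depends on the variance of the transition kernel.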

Videos

Implicit Bias of Linear RNNs · SlidesLive

Taxonomy

Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Gaussian Processes and Bayesian Inference

Methods: Convolution