Bifurcations and loss jumps in RNN training

Lukas Eisenmann; Zahra Monfared; Niclas Alexander Göring; Daniel Durstewitz

arXiv:2310.17561·cs.LG·October 27, 2023·2 cites


TL;DR

This paper investigates bifurcations in ReLU-based RNNs, mathematically links them to loss jumps during training, and introduces an exact algorithm for detecting fixed points and cycles, enhancing understanding of RNN dynamics and training behavior.

Contribution

It provides a mathematical proof connecting bifurcations to the behavior of loss gradients, and introduces a novel heuristic algorithm that returns exact results when detecting fixed points and cycles in ReLU-based RNNs, improving the available analysis tools.
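To make the fixed-point problem concrete, here is a minimal sketch that is not the paper's algorithm. It assumes a piecewise-linear RNN of the form z_{t+1} = A z_t + W ReLU(z_t) + h (a parameterization chosen here for illustration; the names `A`, `W`, `h`, and `fixed_points_by_enumeration` are hypothetical). Within each linear region the ReLU acts as a fixed 0/1 diagonal matrix D, so the map is affine and a candidate fixed point solves (I - A - W D) z = h. The candidate is a true fixed point only if its sign pattern matches D. The sketch simply tries all 2^M regions, which is only feasible for very small M.

```python
import itertools
import numpy as np

def fixed_points_by_enumeration(A, W, h, tol=1e-9):
    """Brute-force fixed points of z -> A z + W relu(z) + h (illustrative only)."""
    M = len(h)
    I = np.eye(M)
    found = []
    for bits in itertools.product([0, 1], repeat=M):
        d = np.array(bits, dtype=float)
        # In this region relu(z) = D z with D = diag(d); W * d equals W @ D
        # (column j of W is scaled by d[j]).
        J = A + W * d
        lhs = I - J
        if abs(np.linalg.det(lhs)) < tol:
            continue  # singular system: no isolated fixed point in this region
        z = np.linalg.solve(lhs, h)
        # Consistency check: z_j > 0 exactly where d_j = 1.
        if np.array_equal(z > 0, d.astype(bool)):
            # Stable if all eigenvalues of the region's Jacobian lie inside the unit circle.
            stable = bool(np.max(np.abs(np.linalg.eigvals(J))) < 1)
            found.append((z, bits, stable))
    return found

# Assumed toy parameters: two units that inhibit each other.
A = np.diag([0.5, 0.5])
W = np.array([[0.0, -1.0], [-1.0, 0.0]])
h = np.array([1.0, 1.0])

for z, bits, stable in fixed_points_by_enumeration(A, W, h):
    print(bits, z, "stable" if stable else "unstable")
```

For these toy parameters the enumeration finds two stable fixed points separated by one unstable fixed point. The exponential cost of visiting every region is the reason a smarter search over regions is needed for networks of realistic size.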

Findings

01

Bifurcations are linked to loss jumps in RNN training.

02

The new algorithm accurately finds fixed points and cycles in ReLU RNNs.

03

Generalized teacher forcing avoids certain bifurcations during training.

Abstract

Recurrent neural networks (RNNs) are popular machine learning tools for modeling and forecasting sequential data and for inferring dynamical systems (DS) from observed time series. Concepts from DS theory (DST) have variously been used to further our understanding of both how trained RNNs solve complex tasks and the training process itself. Bifurcations are particularly important phenomena in DS, including RNNs, that refer to topological (qualitative) changes in a system's dynamical behavior as one or more of its parameters are varied. Knowing the bifurcation structure of an RNN will thus allow one to deduce many of its computational and dynamical properties, like its sensitivity to parameter variations or its behavior during training. In particular, bifurcations may account for sudden loss jumps observed in RNN training that could severely impede the training process. Here we first…


Videos

NeurIPS 2023 Poster Session 2 (Wednesday Morning) · youtube

Bifurcations and loss jumps in RNN training · slideslive

Taxonomy

Topics: Model Reduction and Neural Networks · Neural Networks and Applications