Gradient Flossing: Improving Gradient Descent through Dynamic Control of Jacobians
Rainer Engelken

TL;DR
Gradient flossing is a novel method that stabilizes gradients in RNN training by controlling Lyapunov exponents, which improves convergence and extends the time horizons over which training succeeds.
Contribution
We introduce gradient flossing, a new regularization technique that stabilizes gradients by regulating Lyapunov exponents during RNN training, which improves learning on tasks with long time horizons.
Findings
Improves success rate and convergence speed for long-horizon tasks.
Controls gradient norm and Jacobian condition number effectively.
Extends the feasible time horizon for backpropagation through time.
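To make the gradient norm and condition number named above concrete, the following NumPy sketch builds the long-term Jacobian of a toy RNN as a product of per-step Jacobians and reads both quantities from its singular values. The model h_{t+1} = tanh(W h_t) and the helper names long_term_jacobian and gradient_diagnostics are hypothetical choices for illustration; the paper's networks and measurements may differ.

```python
import numpy as np

def long_term_jacobian(W, h0, steps):
    """Product of step Jacobians dh_T/dh_0 for the toy RNN h_{t+1} = tanh(W h_t).

    The update rule is an assumption made for this sketch.
    """
    h = h0.copy()
    J = np.eye(W.shape[0])
    for _ in range(steps):
        h = np.tanh(W @ h)
        # Step Jacobian: diag(1 - h_{t+1}^2) @ W, then chain rule.
        J = ((1.0 - h**2)[:, None] * W) @ J
    return J

def gradient_diagnostics(J):
    """Return (spectral norm, condition number) of a Jacobian matrix."""
    s = np.linalg.svd(J, compute_uv=False)  # singular values, descending
    return s[0], s[0] / s[-1]
```

In this toy setting, a spectral norm far below or above 1 corresponds to vanishing or exploding gradients, and a large condition number suggests that error signals survive along only a few directions.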
Abstract
Training recurrent neural networks (RNNs) remains a challenge due to the instability of gradients across long time horizons, which can lead to exploding and vanishing gradients. Recent research has linked these problems to the values of Lyapunov exponents of the forward dynamics, which describe the growth or shrinkage of infinitesimal perturbations. Here, we propose gradient flossing, a novel approach to tackling gradient instability by pushing Lyapunov exponents of the forward dynamics toward zero during learning. We achieve this by regularizing Lyapunov exponents through backpropagation using differentiable linear algebra. This enables us to "floss" the gradients, stabilizing them and thus improving network training. We demonstrate that gradient flossing controls not only the gradient norm but also the condition number of the long-term Jacobian, facilitating multidimensional error…
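As a rough illustration of the quantity being regularized, the sketch below estimates the leading Lyapunov exponents of a toy RNN by repeated QR decompositions of its step Jacobians, then evaluates a penalty that is zero exactly when those exponents are zero. The update rule h_{t+1} = tanh(W h_t), the squared-exponent form of the penalty, and the names lyapunov_exponents and flossing_penalty are assumptions made for this example, not details confirmed by the text above. The code uses NumPy and computes forward values only; backpropagating through it would require an autodiff framework with a differentiable QR.

```python
import numpy as np

def lyapunov_exponents(W, h0, steps, k, seed=0):
    """Estimate the first k Lyapunov exponents of h_{t+1} = tanh(W h_t).

    The toy update rule is an assumption for this sketch.
    """
    n = W.shape[0]
    h = h0.copy()
    # Random orthonormal basis spanning k perturbation directions.
    Q, _ = np.linalg.qr(np.random.default_rng(seed).standard_normal((n, k)))
    log_growth = np.zeros(k)
    for _ in range(steps):
        h = np.tanh(W @ h)
        D = (1.0 - h**2)[:, None] * W      # step Jacobian dh_{t+1}/dh_t
        Q, R = np.linalg.qr(D @ Q)         # re-orthonormalize the basis
        # |diag(R)| holds the per-direction stretching factors of this step.
        log_growth += np.log(np.abs(np.diag(R)))
    return log_growth / steps

def flossing_penalty(lyap):
    """Assumed penalty: sum of squared exponents, minimized at zero."""
    return float(np.sum(lyap**2))
```

For W equal to g times an orthogonal matrix and h0 = 0, every step Jacobian is W itself, so all estimated exponents equal log g and the penalty vanishes at g = 1.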
Taxonomy
Topics: Model Reduction and Neural Networks · Neural Networks and Applications · Stochastic Gradient Optimization Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
