# Adaptive Learning Rate Clipping Stabilizes Learning

**Authors:** Jeffrey M. Ede, Richard Beanland

arXiv: 1906.09060 · 2020-05-21

## TL;DR

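Adaptive Learning Rate Clipping (ALRC) stabilizes neural network training by limiting backpropagated losses to a set number of standard deviations above their running means. It helps most with small batch sizes, high-order loss functions, or unstably high learning rates, and it does not affect backpropagated gradient distributions.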

## Contribution

The paper introduces ALRC, a simple, computationally inexpensive method designed to complement existing learning algorithms. It can be applied to any loss function or batch size, is robust to hyperparameter choices, and does not affect backpropagated gradient distributions.

## Key findings

- On CIFAR-10 supersampling, ALRC decreases errors for unstable mean quartic error training.
- ALRC decreases unstable mean squared errors for partial scanning transmission electron micrograph completion.
- Stable mean squared error training on CIFAR-10 supersampling is unaffected by ALRC.

## Abstract

Artificial neural network training with stochastic gradient descent can be destabilized by "bad batches" with high losses. This is often problematic for training with small batch sizes, high-order loss functions or unstably high learning rates. To stabilize learning, we have developed adaptive learning rate clipping (ALRC) to limit backpropagated losses to a number of standard deviations above their running means. ALRC is designed to complement existing learning algorithms: our algorithm is computationally inexpensive, can be applied to any loss function or batch size, is robust to hyperparameter choices and does not affect backpropagated gradient distributions. Experiments with CIFAR-10 supersampling show that ALRC decreases errors for unstable mean quartic error training while stable mean squared error training is unaffected. We also show that ALRC decreases unstable mean squared errors for partial scanning transmission electron micrograph completion. Our source code is publicly available at https://github.com/Jeffrey-Ede/ALRC
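
The clipping rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation (see the linked repository for that). The class name `LossClipper`, its parameters, the default values, the use of a single shared decay rate, and the choice to update the running moments with the unclipped loss are all assumptions made for illustration. The sketch tracks running estimates of the first and second raw moments of the loss, derives a threshold a chosen number of standard deviations above the running mean, and returns a factor that scales an outlying loss down to that threshold.

```python
import math


class LossClipper:
    """Hypothetical helper: limits a scalar loss to mean + n_sigma * std."""

    def __init__(self, mu1, mu2, n_sigma=3.0, decay=0.99):
        # mu1 and mu2 are initial guesses for E[loss] and E[loss^2].
        # All defaults here are illustrative, not values from the paper.
        if mu1 < 0.0 or mu2 < mu1 * mu1:
            raise ValueError("need mu1 >= 0 and mu2 >= mu1**2")
        self.mu1 = mu1
        self.mu2 = mu2
        self.n_sigma = n_sigma
        self.decay = decay

    def threshold(self):
        # Standard deviation from the running raw moments.
        variance = max(self.mu2 - self.mu1 ** 2, 0.0)
        return self.mu1 + self.n_sigma * math.sqrt(variance)

    def scale(self, loss):
        """Return a factor in (0, 1] for a non-negative loss, then update stats."""
        if loss < 0.0:
            raise ValueError("this sketch assumes non-negative losses")
        limit = self.threshold()
        # Losses above the threshold are scaled down to exactly the threshold.
        factor = limit / loss if loss > limit else 1.0
        # Assumption: the running moments track the unclipped loss.
        self.mu1 = self.decay * self.mu1 + (1.0 - self.decay) * loss
        self.mu2 = self.decay * self.mu2 + (1.0 - self.decay) * loss ** 2
        return factor
```

In an automatic differentiation framework, the returned factor would presumably be treated as a constant (no gradient flowing through it) and multiplied into the batch loss before backpropagation. Scaling the loss by a constant scales its gradient by the same constant, which amounts to lowering the learning rate for that batch only.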

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1906.09060/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/1906.09060/full.md

## References

16 references — full list in the complete paper: https://tomesphere.com/paper/1906.09060/full.md

---
Source: https://tomesphere.com/paper/1906.09060