# LipschitzLR: Using theoretically computed adaptive learning rates for fast convergence

**Authors:** Rahul Yedida, Snehanshu Saha, Tejas Prashanth

arXiv: 1902.07399 · 2020-08-04

## TL;DR

This paper introduces LipschitzLR, a method that computes the learning rate for deep neural network training dynamically from the Lipschitz constant of the loss function, aiming at faster convergence with less manual learning-rate tuning.

## Contribution

The paper presents a theoretical framework for computing learning rates from the Lipschitz constant of the loss function under stochastic gradient descent, and extends it to gradient descent with momentum and to Adam.
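As a rough illustration of how a Lipschitz constant can yield a step size, the sketch below assumes the inverse-constant rule $\eta = 1/K$. The symbols $f$, $w$, $K$, and $\eta$ are notation chosen for this note; the paper's exact derivation and per-loss constants are in the full text.

$$
\begin{aligned}
|f(w_1) - f(w_2)| &\le K \,\lVert w_1 - w_2 \rVert, \qquad K = \sup_{w} \lVert \nabla_w f(w) \rVert \\
w_{t+1} &= w_t - \eta \,\nabla_w f(w_t), \qquad \eta = \frac{1}{K} \\
\lVert w_{t+1} - w_t \rVert &= \frac{\lVert \nabla_w f(w_t) \rVert}{K} \le 1
\end{aligned}
$$

The first line follows from the mean value theorem for a differentiable loss on a convex domain. Under this assumed rule, every update has norm at most 1, so the step size is tied to the largest possible gradient rather than to a hand-picked value.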

## Key findings

- Adaptive learning rates computed via Lipschitz constants outperform fixed rates.
- Commonly used learning rates are an order of magnitude smaller than the ideal value.
- The approach accelerates convergence across multiple architectures and datasets.

## Abstract

Optimizing deep neural networks is largely thought to be an empirical process, requiring manual tuning of several hyper-parameters, such as learning rate, weight decay, and dropout rate. Arguably, the learning rate is the most important of these to tune, and this has gained more attention in recent works. In this paper, we propose a novel method to compute the learning rate for training deep neural networks with stochastic gradient descent. We first derive a theoretical framework to compute learning rates dynamically based on the Lipschitz constant of the loss function. We then extend this framework to other commonly used optimization algorithms, such as gradient descent with momentum and Adam. We run an extensive set of experiments that demonstrate the efficacy of our approach on popular architectures and datasets, and show that commonly used learning rates are an order of magnitude smaller than the ideal value.
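The idea of deriving a learning rate from a Lipschitz constant can be sketched on a toy problem. The example below is a minimal sketch, not the paper's implementation: it assumes the learning rate is taken as `1 / K`, where `K` is an upper bound on the gradient norm of the loss, and it uses a deliberately loose bound for logistic regression, `||X||_F / sqrt(m)`, derived here for illustration. All function and variable names are hypothetical.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def loss(X, y, w):
    """Mean binary cross-entropy of a logistic-regression model."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = sigmoid(sum(a * b for a, b in zip(xi, w)))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # clamp to avoid log(0)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(X)

def gradient(X, y, w):
    """Gradient of the mean loss: (1/m) * X^T (sigmoid(Xw) - y)."""
    m = len(X)
    g = [0.0] * len(w)
    for xi, yi in zip(X, y):
        r = sigmoid(sum(a * b for a, b in zip(xi, w))) - yi
        for j, a in enumerate(xi):
            g[j] += r * a / m
    return g

def gradient_norm_bound(X):
    """Assumed bound K: each residual lies in [-1, 1], so
    ||grad|| <= ||X||_F * sqrt(m) / m = ||X||_F / sqrt(m)."""
    m = len(X)
    frobenius = math.sqrt(sum(a * a for row in X for a in row))
    return frobenius / math.sqrt(m)

def norm(v):
    return math.sqrt(sum(a * a for a in v))

# Hypothetical toy data: a bias column plus one feature.
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 3.0], [1.0, 4.0]]
y = [0, 0, 1, 1]

K = gradient_norm_bound(X)
lr = 1.0 / K  # assumed rule: learning rate = inverse Lipschitz bound

w = [0.0, 0.0]
initial_loss = loss(X, y, w)
max_step = 0.0
for _ in range(200):
    step = [lr * gj for gj in gradient(X, y, w)]
    max_step = max(max_step, norm(step))  # track the largest update
    w = [wj - sj for wj, sj in zip(w, step)]
final_loss = loss(X, y, w)

print(f"K={K:.3f} lr={lr:.3f} loss {initial_loss:.3f} -> {final_loss:.3f}")
```

Because `K` bounds the gradient norm, no single update moves the weights by more than 1 in norm. The bound used here is generic; tighter, loss-specific constants give larger learning rates, and the paper's own constants should be taken from the full text.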

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.07399/full.md

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/1902.07399/full.md

## References

49 references, with the full list in the complete paper: https://tomesphere.com/paper/1902.07399/full.md

---
Source: https://tomesphere.com/paper/1902.07399