QLABGrad: a Hyperparameter-Free and Convergence-Guaranteed Scheme for Deep Learning

Minghan Fu; Fang-Xiang Wu

arXiv:2302.00252 · cs.LG · March 13, 2024

TL;DR

QLABGrad is a novel, hyperparameter-free learning rate adaptation scheme for deep learning that guarantees convergence and outperforms existing methods across multiple architectures and datasets.

Contribution

It introduces QLABGrad, a hyperparameter-free scheme that automatically adapts learning rates with proven convergence guarantees for deep learning.

Findings

01 Outperforms competing learning rate adaptation schemes on multiple architectures (MLP, CNN, and ResNet).

02 Convergence is proven under a Lipschitz smoothness condition on the loss function.

03 Effective across the MNIST, CIFAR10, and ImageNet datasets.

Abstract

The learning rate is a critical hyperparameter for deep learning tasks since it determines the extent to which the model parameters are updated during the learning course. However, the choice of learning rates typically depends on empirical judgment, which may not result in satisfactory outcomes without intensive trial-and-error experiments. In this study, we propose a novel learning rate adaptation scheme called QLABGrad. Without any user-specified hyperparameter, QLABGrad automatically determines the learning rate by optimizing the Quadratic Loss Approximation-Based (QLAB) function for a given gradient descent direction, where only one extra forward propagation is required. We theoretically prove the convergence of QLABGrad with a smooth Lipschitz condition on the loss function. Experimental results on multiple architectures, including MLP, CNN, and ResNet, on MNIST, CIFAR10, and ImageNet…
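The abstract does not spell out the update rule, so the following is only a minimal sketch of one way a quadratic loss approximation along the gradient direction could yield a step size. It assumes a one-dimensional model q(a) = L(theta) - a*||g||^2 + c*a^2, fitted from the current loss, the gradient norm, and one extra loss evaluation at a probe point. The names `qlab_step_size` and `trial_step`, and the fallback for non-positive curvature, are hypothetical and not taken from the paper.

```python
def qlab_step_size(loss_fn, theta, grad, trial_step=1.0):
    """Step size minimizing a 1-D quadratic model of the loss along -grad.

    Illustrative sketch only: trial_step and the fallback branch are
    assumptions of this example, not details confirmed by the source.
    """
    g2 = sum(g * g for g in grad)          # squared gradient norm
    if g2 == 0.0:
        return 0.0                         # stationary point: no step needed

    loss0 = loss_fn(theta)
    probe = [t - trial_step * g for t, g in zip(theta, grad)]
    loss1 = loss_fn(probe)                 # the single extra forward pass

    # Fit q(a) = loss0 - a*g2 + c*a^2 so that q(trial_step) == loss1.
    curvature = (loss1 - loss0 + trial_step * g2) / trial_step ** 2
    if curvature <= 0.0:
        return trial_step                  # model has no minimum (assumed fallback)
    return g2 / (2.0 * curvature)          # minimizer of q(a)


def half_squared_norm(theta):
    """Toy loss L(theta) = 0.5 * ||theta||^2, whose gradient is theta itself."""
    return 0.5 * sum(t * t for t in theta)


if __name__ == "__main__":
    theta = [3.0, 4.0]
    grad = list(theta)                     # gradient of the toy loss at theta
    alpha = qlab_step_size(half_squared_norm, theta, grad, trial_step=0.5)
    theta = [t - alpha * g for t, g in zip(theta, grad)]
    print(alpha, theta)
```

For a loss that is exactly quadratic along the search direction, the fitted model coincides with the loss, so the returned step is the exact line minimizer whatever probe step is used. For general losses the step is only as good as the local quadratic fit.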

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications · Neural Networks and Applications

Methods: Grouped Convolution · Average Pooling · Dense Connections · Residual Block · Channel Shuffle · Kaiming Initialization · Softmax · Convolution · Depthwise Convolution