Training Neural Networks for and by Interpolation

Leonard Berrada; Andrew Zisserman; M. Pawan Kumar

arXiv:1906.05661 · cs.LG · August 4, 2020 · 6 cites

TL;DR

This paper introduces ALI-G, an adaptive optimization algorithm that exploits the interpolation property of neural networks to set its learning rate automatically at each iteration, achieving state-of-the-art results among adaptive methods with minimal tuning across various architectures and datasets.

Contribution

The paper proposes ALI-G, a novel interpolation-based adaptive optimizer that simplifies tuning and matches or surpasses existing methods' performance in deep learning tasks.

Findings

01 ALI-G achieves state-of-the-art results among adaptive methods.

02 ALI-G performs comparably to SGD, yet it does not require a learning-rate decay schedule.

03 ALI-G is simple to implement and versatile across architectures and datasets.

Abstract

In modern supervised learning, many deep neural networks are able to interpolate the data: the empirical loss can be driven to near zero on all samples simultaneously. In this work, we explicitly exploit this interpolation property for the design of a new optimization algorithm for deep learning, which we term Adaptive Learning-rates for Interpolation with Gradients (ALI-G). ALI-G retains the two main advantages of Stochastic Gradient Descent (SGD), which are (i) a low computational cost per iteration and (ii) good generalization performance in practice. At each iteration, ALI-G exploits the interpolation property to compute an adaptive learning-rate in closed form. In addition, ALI-G clips the learning-rate to a maximal value, which we prove to be helpful for non-convex problems. Crucially, in contrast to the learning-rate of SGD, the maximal learning-rate of ALI-G does not require a…
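The closed-form, clipped learning rate described in the abstract can be illustrated with a minimal sketch. It assumes the step size takes the Polyak-style form loss / (squared gradient norm + delta), clipped at a maximal value; the function names, the `delta` constant, and the toy least-squares problem are illustrative assumptions and are not taken from the official repository.

```python
import random


def ali_g_step(w, loss, grad, max_lr, delta=1e-5):
    """One ALI-G-style update on a list of floats (illustrative sketch).

    Assumption: interpolation means the optimal loss is close to 0, so the
    step size is loss / (||grad||^2 + delta), clipped to max_lr.
    """
    grad_sq_norm = sum(g * g for g in grad)
    lr = min(loss / (grad_sq_norm + delta), max_lr)
    new_w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return new_w, lr


def sample_loss_and_grad(w, x, y):
    """Squared error 0.5 * (w.x - y)^2 on one sample, and its gradient."""
    residual = sum(wi * xi for wi, xi in zip(w, x)) - y
    return 0.5 * residual ** 2, [residual * xi for xi in x]


def mean_loss(w, data):
    """Average per-sample loss over the whole dataset."""
    return sum(sample_loss_and_grad(w, x, y)[0] for x, y in data) / len(data)


def train(data, dim, max_lr=1.0, epochs=50, seed=0):
    """Stochastic training loop: one sample per step, no decay schedule."""
    rng = random.Random(seed)
    w = [0.0] * dim
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            loss, grad = sample_loss_and_grad(w, x, y)
            w, _ = ali_g_step(w, loss, grad, max_lr)
    return w


# Toy problem that can be interpolated: noiseless linear targets.
true_w = [2.0, -1.0, 0.5]
data_rng = random.Random(1)
data = []
for _ in range(20):
    x = [data_rng.uniform(-1.0, 1.0) for _ in true_w]
    data.append((x, sum(a * b for a, b in zip(true_w, x))))

initial_loss = mean_loss([0.0] * len(true_w), data)
final_w = train(data, dim=len(true_w))
final_loss = mean_loss(final_w, data)
```

In this sketch the only hyper-parameter that needs choosing is `max_lr`; the per-step learning rate shrinks on its own as the sample loss approaches zero, which is why no decay schedule appears in the loop.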

Code & Models

Repositories

oval-group/ali-g (PyTorch, official)

Videos

Training Neural Networks for and by Interpolation · SlidesLive

Taxonomy

Topics: Neural Networks and Applications · Stochastic Gradient Optimization Techniques · Model Reduction and Neural Networks

Methods: Adam · Stochastic Gradient Descent