Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs

Timur Garipov; Pavel Izmailov; Dmitrii Podoprikhin; Dmitry Vetrov; Andrew Gordon Wilson

arXiv:1802.10026 · stat.ML · October 31, 2018 · 212 cites

TL;DR

This paper reveals that deep neural network optima are connected by simple curves with nearly constant accuracy, and introduces a fast ensembling method leveraging this geometric property to improve performance efficiently.

Contribution

The paper uncovers the geometric connectivity of neural network optima and, building on this insight, proposes Fast Geometric Ensembling (FGE), a new and efficient ensembling technique.

Findings

01 High-accuracy pathways connect neural network optima.

02 FGE trains high-performing ensembles in the time of a single model.

03 FGE outperforms Snapshot Ensembles on benchmark datasets.
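
The findings above do not say how ensemble members are combined at prediction time. A common choice, assumed here, is to average the class probabilities predicted by each member. The sketch below uses random logits as a stand-in for real network outputs; the function names `softmax` and `ensemble_predict` are illustrative and do not come from the paper's code.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_predict(member_logits):
    # member_logits has shape (M, N, C): M ensemble members,
    # N examples, C classes. Assumption: members are combined
    # by averaging their predicted class probabilities.
    probs = softmax(np.asarray(member_logits, dtype=float))
    return probs.mean(axis=0)

# Toy stand-in for the outputs of 3 networks on 4 examples, 5 classes.
rng = np.random.default_rng(0)
member_logits = rng.normal(size=(3, 4, 5))

avg_probs = ensemble_predict(member_logits)
predicted_classes = avg_probs.argmax(axis=1)
```

Averaging probabilities rather than logits keeps each member's output on a common scale, so no single member dominates the combined prediction.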

Abstract

The loss functions of deep neural networks are complex and their geometric properties are not well understood. We show that the optima of these complex loss functions are in fact connected by simple curves over which training and test accuracy are nearly constant. We introduce a training procedure to discover these high-accuracy pathways between modes. Inspired by this new geometric insight, we also propose a new ensembling method entitled Fast Geometric Ensembling (FGE). Using FGE we can train high-performing ensembles in the time required to train a single model. We achieve improved performance compared to the recent state-of-the-art Snapshot Ensembles, on CIFAR-10, CIFAR-100, and ImageNet.
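
To make the idea of a "simple curve" between two optima concrete, here is a minimal sketch. The abstract does not name a curve family; a quadratic Bezier curve with a single bend point is assumed here. The two-dimensional `toy_loss` (minima on a ring of radius 1) and the hand-picked bend point `theta` are hypothetical stand-ins: in the paper's procedure the path is found by training, not chosen by hand.

```python
import numpy as np

def bezier_point(w1, w2, theta, t):
    # Quadratic Bezier curve from w1 (t=0) to w2 (t=1),
    # bent toward the control point theta.
    return (1 - t) ** 2 * w1 + 2 * t * (1 - t) * theta + t ** 2 * w2

def toy_loss(w):
    # Hypothetical loss whose minima form a ring of radius 1.
    return (np.linalg.norm(w) - 1.0) ** 2

w1 = np.array([1.0, 0.0])      # first "mode"
w2 = np.array([-1.0, 0.0])     # second "mode"
theta = np.array([0.0, 2.0])   # hand-picked bend (assumption)

ts = np.linspace(0.0, 1.0, 11)
curve_losses = [toy_loss(bezier_point(w1, w2, theta, t)) for t in ts]
line_losses = [toy_loss((1 - t) * w1 + t * w2) for t in ts]

print("max loss on straight line:", max(line_losses))
print("max loss on Bezier curve: ", max(curve_losses))
```

In this toy setting the straight segment passes through a high-loss region at the origin, while the bent curve stays close to the ring of minima, which mirrors the qualitative claim that modes can be joined by a low-loss path even when the direct line between them is not one.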

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning

Methods: Snapshot Ensembles: Train 1, get M for free