Loss Surfaces, Mode Connectivity, and Fast Ensembling of DNNs
Timur Garipov, Pavel Izmailov, Dmitrii Podoprikhin, Dmitry Vetrov, Andrew Gordon Wilson

TL;DR
This paper reveals that deep neural network optima are connected by simple curves with nearly constant accuracy, and introduces a fast ensembling method leveraging this geometric property to improve performance efficiently.
Contribution
The paper uncovers the geometric connectivity of neural network optima and proposes Fast Geometric Ensembling, a novel efficient ensembling technique based on this insight.
Findings
High-accuracy pathways connect neural network optima.
Fast Geometric Ensembling (FGE) trains high-performing ensembles in the time required to train a single model.
FGE outperforms Snapshot Ensembles on benchmark datasets.
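To make the ensembling step concrete, here is a minimal sketch of how predictions from several saved checkpoints might be combined. It assumes the common approach of averaging per-class probabilities; the function names `softmax` and `ensemble_predict` are hypothetical and are not taken from the authors' code, and the sketch does not cover how FGE collects its checkpoints.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max for numerical stability before exponentiating.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_predict(logits_per_model):
    """Average class probabilities over ensemble members (illustrative assumption).

    logits_per_model: list of arrays, each of shape (num_examples, num_classes),
    one array per checkpoint in the ensemble.
    Returns (mean_probs, predicted_labels).
    """
    probs = np.mean(
        [softmax(np.asarray(l, dtype=float)) for l in logits_per_model], axis=0
    )
    return probs, probs.argmax(axis=-1)
```

Averaging probabilities rather than hard votes keeps each member's confidence in the final prediction, which is one reason it is a common default for ensembles of this kind.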
Abstract
The loss functions of deep neural networks are complex and their geometric properties are not well understood. We show that the optima of these complex loss functions are in fact connected by simple curves over which training and test accuracy are nearly constant. We introduce a training procedure to discover these high-accuracy pathways between modes. Inspired by this new geometric insight, we also propose a new ensembling method entitled Fast Geometric Ensembling (FGE). Using FGE we can train high-performing ensembles in the time required to train a single model. We achieve improved performance compared to the recent state-of-the-art Snapshot Ensembles, on CIFAR-10, CIFAR-100, and ImageNet.
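As an illustration of what a "simple curve" between two optima can look like, the sketch below evaluates a quadratic Bezier curve in weight space, one of the curve families considered in the paper. The endpoints `w1` and `w2` stand for two trained weight vectors and `theta` for the bend point that the training procedure would learn; the function names and the toy two-dimensional weights are assumptions made for illustration only, and the sketch does not train anything.

```python
import numpy as np

def bezier_point(w1, theta, w2, t):
    """Point on the quadratic Bezier curve from w1 (t=0) to w2 (t=1).

    phi(t) = (1 - t)^2 * w1 + 2 t (1 - t) * theta + t^2 * w2
    """
    w1, theta, w2 = (np.asarray(a, dtype=float) for a in (w1, theta, w2))
    return (1 - t) ** 2 * w1 + 2 * t * (1 - t) * theta + t ** 2 * w2

def curve_points(w1, theta, w2, num=5):
    # Evenly spaced values of t in [0, 1]; each point is a full weight vector
    # at which loss or accuracy could be evaluated.
    return [bezier_point(w1, theta, w2, t) for t in np.linspace(0.0, 1.0, num)]
```

Evaluating train and test accuracy at each point returned by `curve_points` would be one way to check whether accuracy stays nearly constant along the path, as the abstract reports.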
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
Methods: Snapshot Ensembles: Train 1, get M for free
