Snapshot Ensembles: Train 1, get M for free
Gao Huang, Yixuan Li, Geoff Pleiss, Zhuang Liu, John E. Hopcroft, Kilian Q. Weinberger

TL;DR
Snapshot Ensembles train a single neural network with cyclic learning rates to produce multiple effective models along the training path, enabling ensemble benefits without extra training cost.
Contribution
The paper introduces Snapshot Ensembling, a method to generate multiple models from one training run using cyclic learning rates, reducing computational costs.
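To make the idea of a cyclic learning rate concrete, here is a minimal sketch that assumes a cosine-annealed schedule restarted at the beginning of each cycle. The function names, the 0-indexed step convention, and the choice to save a snapshot on the last step of each cycle are illustrative assumptions, not details taken from this summary.

```python
import math


def cyclic_cosine_lr(step, total_steps, num_cycles, base_lr):
    """Learning rate at a 0-indexed training step.

    Assumed schedule: cosine annealing from base_lr toward 0 within each
    cycle, then an abrupt restart to base_lr at the start of the next cycle.
    """
    cycle_len = math.ceil(total_steps / num_cycles)
    pos = step % cycle_len  # position inside the current cycle
    return 0.5 * base_lr * (math.cos(math.pi * pos / cycle_len) + 1.0)


def snapshot_steps(total_steps, num_cycles):
    """0-indexed steps that close each cycle, where a snapshot would be saved."""
    cycle_len = math.ceil(total_steps / num_cycles)
    return [min((k + 1) * cycle_len, total_steps) - 1 for k in range(num_cycles)]
```

With, say, 300 steps and 6 cycles, the rate falls from `base_lr` to near zero over each block of 50 steps, and one set of weights is saved at the end of each block, so a single run yields 6 candidate ensemble members.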
Findings
Achieves lower error rates than single models on CIFAR datasets.
Compares favorably with traditional ensembles while incurring no extra training cost.
Compatible with diverse architectures and tasks.
Abstract
Ensembles of neural networks are known to be much more robust and accurate than individual networks. However, training multiple deep networks for model averaging is computationally expensive. In this paper, we propose a method to obtain the seemingly contradictory goal of ensembling multiple neural networks at no additional training cost. We achieve this goal by training a single neural network, converging to several local minima along its optimization path and saving the model parameters. To obtain repeated rapid convergence, we leverage recent work on cyclic learning rate schedules. The resulting technique, which we refer to as Snapshot Ensembling, is simple, yet surprisingly effective. We show in a series of experiments that our approach is compatible with diverse network architectures and learning tasks. It consistently yields lower error rates than state-of-the-art single models at…
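The model averaging mentioned in the abstract can be sketched as follows, assuming each saved snapshot produces a logit vector for the same input and that the ensemble prediction is the mean of the per-snapshot softmax outputs. The helper names and the pure-Python representation of logits as lists are hypothetical choices made for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(snapshot_logits):
    """Average class probabilities across snapshots for a single input.

    snapshot_logits: one logit vector per saved snapshot (assumed layout).
    Returns (averaged probabilities, index of the most probable class).
    """
    probs = [softmax(logits) for logits in snapshot_logits]
    num_classes = len(probs[0])
    avg = [sum(p[c] for p in probs) / len(probs) for c in range(num_classes)]
    label = max(range(num_classes), key=avg.__getitem__)
    return avg, label
```

Averaging probabilities rather than raw logits keeps each snapshot's vote on the same scale, so one overconfident snapshot does not dominate purely through larger logit magnitudes.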
Taxonomy
Topics: Advanced Neural Network Applications · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning
Methods: Snapshot Ensembles: Train 1, get M for free · Concatenated Skip Connection · Batch Normalization · Convolution · Average Pooling · Global Average Pooling · Dense Block · Kaiming Initialization · 1x1 Convolution
