Snapshot Ensembles: Train 1, get M for free

Gao Huang; Yixuan Li; Geoff Pleiss; Zhuang Liu; John E. Hopcroft; Kilian Q. Weinberger

arXiv:1704.00109·cs.LG·April 4, 2017·118 cites

TL;DR

Snapshot Ensembles train a single neural network with cyclic learning rates to produce multiple effective models along the training path, enabling ensemble benefits without extra training cost.

Contribution

The paper introduces Snapshot Ensembling, a method to generate multiple models from one training run using cyclic learning rates, reducing computational costs.
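
As a rough illustration of what a cyclic learning rate with periodic snapshots can look like, here is a minimal sketch. It assumes a cosine-annealed cycle that restarts at a maximum rate `lr_max`; the text above only says "cyclic learning rates", so the exact shape, the function names `snapshot_lr` and `is_snapshot_step`, and all parameter names are assumptions made for this example.

```python
import math


def snapshot_lr(iteration, total_iters, num_cycles, lr_max):
    """Learning rate at a 0-based iteration under an assumed cosine cycle.

    The run of `total_iters` steps is split into `num_cycles` equal cycles.
    Each cycle starts at `lr_max` and decays toward zero, then restarts.
    """
    cycle_len = math.ceil(total_iters / num_cycles)
    pos = iteration % cycle_len  # position inside the current cycle
    return 0.5 * lr_max * (math.cos(math.pi * pos / cycle_len) + 1.0)


def is_snapshot_step(iteration, total_iters, num_cycles):
    """True on the last iteration of each cycle, where the rate is lowest."""
    cycle_len = math.ceil(total_iters / num_cycles)
    return (iteration + 1) % cycle_len == 0


if __name__ == "__main__":
    total, cycles, lr_max = 300, 6, 0.1
    for it in range(total):
        lr = snapshot_lr(it, total, cycles, lr_max)
        # A real training loop would take one optimizer step with `lr` here.
        if is_snapshot_step(it, total, cycles):
            # A real training loop would save the model weights here.
            print(f"iteration {it}: lr={lr:.6f}, save snapshot")
```

In this sketch the model is saved once per cycle, just before the learning rate jumps back up, so a run with `num_cycles` cycles yields that many snapshots for the cost of a single training run.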

Findings

01. Achieves lower error rates than single models on CIFAR datasets.

02. Consistently yields lower error rates than state-of-the-art single models at no additional training cost, and compares favorably with traditional ensembles.

03. Compatible with diverse architectures and tasks.

Abstract

Ensembles of neural networks are known to be much more robust and accurate than individual networks. However, training multiple deep networks for model averaging is computationally expensive. In this paper, we propose a method to obtain the seemingly contradictory goal of ensembling multiple neural networks at no additional training cost. We achieve this goal by training a single neural network, converging to several local minima along its optimization path and saving the model parameters. To obtain repeated rapid convergence, we leverage recent work on cyclic learning rate schedules. The resulting technique, which we refer to as Snapshot Ensembling, is simple, yet surprisingly effective. We show in a series of experiments that our approach is compatible with diverse network architectures and learning tasks. It consistently yields lower error rates than state-of-the-art single models at…
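
The "model averaging" step the abstract refers to can be sketched as follows. This is an illustrative example, not the authors' code: it assumes each saved snapshot produces a vector of class logits for an input, and that the ensemble prediction is the mean of the snapshots' softmax outputs. The names `softmax` and `ensemble_predict` and the toy logits are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of class logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(snapshot_logits):
    """Average the class probabilities from several snapshots for one input.

    snapshot_logits[m][c] is the logit for class c from snapshot m.
    Returns (mean probabilities, index of the predicted class).
    """
    probs = [softmax(logits) for logits in snapshot_logits]
    num_classes = len(probs[0])
    mean = [sum(p[c] for p in probs) / len(probs) for c in range(num_classes)]
    predicted = max(range(num_classes), key=mean.__getitem__)
    return mean, predicted


if __name__ == "__main__":
    # Toy logits from three hypothetical snapshots for a 3-class problem.
    toy_logits = [
        [2.0, 1.0, 0.0],
        [0.0, 3.0, 0.0],
        [2.5, 0.5, 0.0],
    ]
    mean_probs, label = ensemble_predict(toy_logits)
    print(mean_probs, label)
```

Averaging probabilities rather than taking a majority vote lets a confident snapshot outweigh uncertain ones; that is a design choice of this sketch.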

Taxonomy

Topics: Advanced Neural Network Applications · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning

Methods: Snapshot Ensembles: Train 1, get M for free · Concatenated Skip Connection · Batch Normalization · Convolution · Average Pooling · Global Average Pooling · Dense Block · Kaiming Initialization · 1x1 Convolution