The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks

Jonathan Frankle; Michael Carbin

arXiv:1803.03635 · cs.LG · March 5, 2019 · 1.3k cites

TL;DR

This paper introduces the lottery ticket hypothesis: dense, randomly initialized neural networks contain sparse subnetworks that, when trained in isolation from their original initial weights, reach test accuracy comparable to or better than the full network. The result highlights how much a network's initialization matters to whether it trains effectively.

Contribution

The paper proposes the lottery ticket hypothesis and presents an algorithm, built on a standard pruning technique, for identifying winning tickets: sparse subnetworks that reach test accuracy comparable to the original network in a similar number of training iterations.
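A minimal sketch of how such a search could look, assuming iterative magnitude pruning with the surviving weights reset to their initial values after each round. The toy least-squares model, the function names (`train`, `prune_by_magnitude`, `find_winning_ticket`), the 20% per-round pruning rate, and the number of rounds are all illustrative assumptions, not details taken from this page.

```python
import numpy as np


def train(init_weights, mask, X, y, steps=200, lr=0.1):
    """Hypothetical stand-in for training: masked gradient descent on least squares."""
    w = init_weights * mask
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w = (w - lr * grad) * mask  # pruned weights are held at zero
    return w


def prune_by_magnitude(trained, mask, fraction):
    """Remove the given fraction of surviving weights with the smallest magnitude."""
    alive = np.flatnonzero(mask)
    n_prune = int(round(fraction * alive.size))
    order = np.argsort(np.abs(trained[alive]))
    new_mask = mask.copy()
    new_mask[alive[order[:n_prune]]] = 0.0
    return new_mask


def find_winning_ticket(init_weights, X, y, rounds=5, fraction=0.2):
    """Alternate training and pruning; every round restarts from init_weights."""
    mask = np.ones_like(init_weights)
    for _ in range(rounds):
        trained = train(init_weights, mask, X, y)  # start from the original init
        mask = prune_by_magnitude(trained, mask, fraction)
    # The candidate ticket is the surviving subnetwork at its original initial values.
    return mask, init_weights * mask


# Toy data (assumption): a linear target that depends on only 5 of 50 inputs.
rng = np.random.default_rng(0)
X = rng.normal(size=(256, 50))
true_w = np.zeros(50)
true_w[:5] = rng.normal(size=5)
y = X @ true_w

theta0 = rng.normal(scale=0.1, size=50)
mask, ticket = find_winning_ticket(theta0, X, y)
print(int(mask.sum()), "of", mask.size, "weights remain")
```

The key design choice in this sketch is that pruning decisions come from the trained weights, while the returned subnetwork keeps the original initial values; a real experiment would then train `ticket` on its own and compare it with the dense network.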

Findings

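01. Winning tickets are consistently less than 10-20% of the size of the original network, across several fully-connected and convolutional architectures for MNIST and CIFAR10.

02. Above that size, winning tickets learn faster than the original network and reach higher test accuracy.

03. Subnetworks with favorable initializations can be trained in isolation, without the rest of the dense network.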

Abstract

Neural network pruning techniques can reduce the parameter counts of trained networks by over 90%, decreasing storage requirements and improving computational performance of inference without compromising accuracy. However, contemporary experience is that the sparse architectures produced by pruning are difficult to train from the start, which would similarly improve training performance. We find that a standard pruning technique naturally uncovers subnetworks whose initializations made them capable of training effectively. Based on these results, we articulate the "lottery ticket hypothesis:" dense, randomly-initialized, feed-forward networks contain subnetworks ("winning tickets") that - when trained in isolation - reach test accuracy comparable to the original network in a similar number of iterations. The winning tickets we find have won the initialization lottery: their…

Code & Models

Videos

The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks · YouTube

Why AI Has a Plato Problem — Mazviita Chirimuuta · YouTube

Build Specialist LLMs Like It’s 2019 (Randall Balestriero) · YouTube

The Lottery Ticket Hypothesis with Jonathan Frankle · YouTube

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning

Methods: Pruning