Sparse Networks from Scratch: Faster Training without Losing Performance
Tim Dettmers, Luke Zettlemoyer

TL;DR
This paper introduces sparse momentum, a novel algorithm for sparse neural network training that accelerates training by up to 5.61 times while maintaining dense performance levels across multiple datasets.
Contribution
The paper presents sparse momentum, a sparse learning method that uses exponentially smoothed gradients (momentum) to decide where pruned weights are regrown, both across layers and within each layer.
Findings
Achieves state-of-the-art sparse performance on MNIST, CIFAR-10, and ImageNet.
Provides up to 5.61x faster training without performance loss.
Is reported to be robust to hyperparameter choices and easy to use.
Abstract
We demonstrate the possibility of what we call sparse learning: accelerated training of deep neural networks that maintain sparse weights throughout training while achieving dense performance levels. We accomplish this by developing sparse momentum, an algorithm which uses exponentially smoothed gradients (momentum) to identify layers and weights which reduce the error efficiently. Sparse momentum redistributes pruned weights across layers according to the mean momentum magnitude of each layer. Within a layer, sparse momentum grows weights according to the momentum magnitude of zero-valued weights. We demonstrate state-of-the-art sparse performance on MNIST, CIFAR-10, and ImageNet, decreasing the mean error by a relative 8%, 15%, and 6% compared to other sparse algorithms. Furthermore, we show that sparse momentum reliably reproduces dense performance levels while providing up to 5.61x…
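The redistribute-and-grow procedure described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration, not the authors' implementation. The function name `sparse_momentum_step`, the explicit boolean `masks`, the `prune_rate` parameter, and the magnitude-based pruning criterion are all assumptions made for the example; the abstract only specifies that pruned weights are redistributed by mean momentum magnitude per layer and regrown where inactive weights have the largest momentum magnitude. The sketch also assumes every layer has at least one active weight and that the momentum is not all zero.

```python
import numpy as np


def sparse_momentum_step(weights, momenta, masks, prune_rate=0.2):
    """One prune / redistribute / regrow cycle (illustrative sketch).

    weights, momenta, masks: dicts mapping a layer name to arrays of the
    same shape; masks are boolean, True marks an active (non-pruned) weight.
    Returns a dict of updated boolean masks.
    """
    names = list(weights)

    # Mean momentum magnitude over the active weights of each layer,
    # normalized into a per-layer share of the regrowth budget.
    contrib = np.array([np.abs(momenta[n][masks[n]]).mean() for n in names])
    share = contrib / contrib.sum()

    flat_masks = {}
    total_pruned = 0
    for n in names:
        flat = masks[n].ravel().copy()
        active = np.flatnonzero(flat)
        k = int(prune_rate * active.size)
        # Assumed pruning criterion: drop the k smallest-magnitude weights.
        order = np.argsort(np.abs(weights[n].ravel()[active]))
        flat[active[order[:k]]] = False
        flat_masks[n] = flat
        total_pruned += k

    new_masks = {}
    for n, s in zip(names, share):
        flat = flat_masks[n]
        inactive = np.flatnonzero(~flat)
        # Layers with larger mean momentum magnitude receive more weights.
        budget = min(int(round(s * total_pruned)), inactive.size)
        # Within a layer, enable the inactive positions whose momentum
        # magnitude is largest.
        order = np.argsort(-np.abs(momenta[n].ravel()[inactive]))
        flat[inactive[order[:budget]]] = True
        new_masks[n] = flat.reshape(masks[n].shape)
    return new_masks
```

In a training loop, a step like this would presumably run periodically (for example once per epoch), with the returned masks applied to the weights before training continues. Because of rounding and the per-layer cap, the number of regrown weights can differ slightly from the number pruned in this simplified version.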
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
