Pushing the Limits of Sparsity: A Bag of Tricks for Extreme Pruning
Andy Li, Aiden Durrant, Milan Markovic, Tianjin Huang, Souvik Kundu, Tianlong Chen, Lu Yin, Georgios Leontidis

TL;DR
This paper introduces techniques for training neural networks at extreme sparsity levels (up to 99.99%) without accuracy collapse, using dynamic training strategies and parameter sharing.
Contribution
The authors develop and validate three complementary methods (Dynamic ReLU phasing, weight sharing, and cyclic sparsity) that encourage stable training of highly sparse neural networks.
Findings
Trained ResNet architectures at sparsities up to 99.99% without accuracy collapse.
Demonstrated improved performance over existing methods at extreme sparsity levels.
Validated techniques on CIFAR-10, CIFAR-100, and ImageNet datasets.
Abstract
Pruning of deep neural networks has been an effective technique for reducing model size while preserving most of the performance of dense networks, which is crucial for deploying models on memory- and power-constrained devices. While recent sparse learning methods have shown promising performance up to moderate sparsity levels such as 95% and 98%, accuracy quickly deteriorates when pushing sparsities to extreme levels due to unique challenges such as fragile gradient flow. In this work, we explore network performance beyond the commonly studied sparsities, and develop techniques that encourage stable training without accuracy collapse even at extreme sparsities, including 99.90%, 99.95% and 99.99% on ResNet architectures. We propose three complementary techniques that enhance sparse training through different mechanisms: 1) Dynamic ReLU phasing, where DyReLU initially allows for richer parameter…
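To make the sparsity levels in the abstract concrete, the short sketch below counts how many weights survive at each level. It is an illustration only: the total parameter count and the layer size are hypothetical, and the random mask stands in for whatever pruning criterion the paper actually uses, which this summary does not describe.

```python
import random


def remaining_weights(total_params: int, sparsity: float) -> int:
    """Return how many weights stay nonzero when a fraction `sparsity` is pruned."""
    return round(total_params * (1.0 - sparsity))


def random_mask(num_weights: int, sparsity: float, seed: int = 0) -> list:
    """Build a flat unstructured mask (True = kept weight).

    Positions are chosen uniformly at random. This is an assumption made
    for illustration, not the paper's pruning criterion.
    """
    keep = remaining_weights(num_weights, sparsity)
    rng = random.Random(seed)
    kept_positions = set(rng.sample(range(num_weights), keep))
    return [i in kept_positions for i in range(num_weights)]


# Hypothetical model size, chosen only to keep the arithmetic readable.
TOTAL_PARAMS = 10_000_000

# Sparsity levels named in the abstract.
for sparsity in (0.95, 0.98, 0.9990, 0.9995, 0.9999):
    kept = remaining_weights(TOTAL_PARAMS, sparsity)
    print(f"sparsity {sparsity:.2%}: {kept} weights kept")

# A hypothetical layer with 10,000 weights at 99.99% sparsity.
layer_mask = random_mask(10_000, 0.9999)
print("kept in a 10,000-weight layer:", sum(layer_mask))
```

Moving from 98% to 99.99% sparsity cuts the number of surviving weights by a factor of 200, and a layer with 10,000 weights is left with a single connection. This is consistent with the abstract's remark that gradient flow becomes fragile at these levels.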
Taxonomy
Topics: Geotechnical and Geomechanical Engineering · Mechanics and Biomechanics Studies
Methods: Average Pooling · Convolution · Global Average Pooling · Kaiming Initialization · Max Pooling
