The EarlyBird Gets the WORM: Heuristically Accelerating EarlyBird Convergence
Adithya Vasudev

TL;DR
This paper introduces WORM, a novel method that accelerates the discovery of lottery tickets in neural networks by exploiting static neuron groups, leading to faster training, less accuracy loss, and better generalization to larger models.
Contribution
WORM extends the Early Bird hypothesis by utilizing static neuron groups to improve convergence speed and robustness across various neural network architectures.
Findings
WORM finds lottery tickets faster than EarlyBird.
WORM-pruned models retain more accuracy during pruning.
WORM generalizes well to larger models like transformers.
Abstract
The Lottery Ticket hypothesis proposes that ideal, sparse subnetworks, called lottery tickets, exist in untrained dense neural networks. The Early Bird hypothesis proposes an efficient algorithm to find these winning lottery tickets in convolutional neural networks, using the novel concept of distance between subnetworks to detect convergence in the subnetworks of a model. However, this approach overlooks unchanging groups of unimportant neurons near the search's end. We propose WORM, a method that exploits these static groups by truncating their gradients, forcing the model to rely on other neurons. Experiments show WORM achieves faster ticket identification during training on convolutional neural networks, despite the additional computational overhead, when compared to EarlyBird search. Additionally, WORM-pruned models lose less accuracy during pruning and recover accuracy faster,…
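The two ideas named in the abstract, a distance between subnetworks and gradient truncation for static neuron groups, can be illustrated with a small sketch. This is not the paper's implementation. It assumes that a subnetwork is a binary keep-mask obtained by pruning the lowest-magnitude per-neuron scores, that the distance is a normalized Hamming distance between masks, and that "truncating" a gradient means setting it to zero. All function names (`prune_mask`, `mask_distance`, `static_pruned`, `truncate_gradients`) are hypothetical.

```python
from typing import List, Sequence


def prune_mask(scores: Sequence[float], prune_ratio: float) -> List[int]:
    """Return a keep-mask: 0 for the lowest-|score| fraction, 1 otherwise.

    Assumption: per-neuron importance is the magnitude of a scalar score.
    """
    n_prune = int(len(scores) * prune_ratio)
    # Neuron indices in ascending order of importance.
    order = sorted(range(len(scores)), key=lambda i: abs(scores[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(scores))]


def mask_distance(a: Sequence[int], b: Sequence[int]) -> float:
    """Normalized Hamming distance between two masks (assumed metric)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def static_pruned(masks: Sequence[Sequence[int]]) -> List[int]:
    """Indices of neurons that are pruned in every mask of the window."""
    return [i for i in range(len(masks[0])) if all(m[i] == 0 for m in masks)]


def truncate_gradients(grads: Sequence[float], static_idx: Sequence[int]) -> List[float]:
    """Zero the gradients of static neurons (assumed reading of 'truncate')."""
    frozen = set(static_idx)
    return [0.0 if i in frozen else g for i, g in enumerate(grads)]


# Toy example: four neurons, scores from two consecutive epochs.
mask_1 = prune_mask([0.9, 0.1, 0.5, 0.05], prune_ratio=0.5)
mask_2 = prune_mask([0.8, 0.2, 0.1, 0.05], prune_ratio=0.5)
distance = mask_distance(mask_1, mask_2)
static = static_pruned([mask_1, mask_2])
new_grads = truncate_gradients([0.3, -0.2, 0.1, 0.4], static)
```

In this toy run only the neuron that stays pruned across both epochs has its gradient zeroed. A search of this kind would stop once the mask distance stays below a chosen threshold over several epochs; the threshold and window length are tuning choices that the abstract does not specify.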
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Stochastic Gradient Optimization Techniques · Generative Adversarial Networks and Image Synthesis
Methods: Pruning
