The rise of the lottery heroes: why zero-shot pruning is hard
Enzo Tartaglione

TL;DR
This paper investigates the challenge of zero-shot pruning in deep learning, aiming to identify trainable sub-networks during training to reduce computational costs, and proposes an approach to address this difficulty.
Contribution
It introduces a novel method for zero-shot pruning that attempts to find trainable sub-graphs during training, highlighting the challenges and trade-offs involved.
Findings
The proposed approach shows potential in reducing training complexity.
Common methods often fail in extreme pruning scenarios.
A trade-off exists between accuracy and computational effort in zero-shot pruning.
Abstract
Recent advances in deep learning optimization have shown that only a subset of parameters is really necessary to successfully train a model. Potentially, such a discovery has broad impact from theory to application; however, finding these trainable sub-networks is known to be a typically costly process. This inhibits practical applications: can the learned sub-graph structures in deep learning models be found at training time? In this work we explore such a possibility, observing and motivating why common approaches typically fail in the extreme scenarios of interest, and proposing an approach which potentially enables training with reduced computational effort. The experiments on challenging architectures and datasets suggest the algorithmic accessibility of such a computational gain, and in particular a trade-off between accuracy achieved and training complexity deployed…
