The Elastic Lottery Ticket Hypothesis
Xiaohan Chen, Yu Cheng, Shuohang Wang, Zhe Gan, Jingjing Liu, Zhangyang Wang

TL;DR
The paper introduces the Elastic Lottery Ticket Hypothesis, proposing that winning tickets can be transformed across different network architectures within the same family, reducing the need for costly pruning procedures.
Contribution
It presents a novel hypothesis that allows transferring winning tickets between networks by layer reordering and scaling, enabling efficient sparse subnetwork identification.
Findings
Winning tickets can be stretched or squeezed across network depths.
Transformed tickets perform close to tickets found directly on the target network.
E-LTH generalizes across model families, layers, and datasets.
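The stretching and squeezing idea in the findings above can be illustrated with a minimal sketch. It assumes a ticket is stored as a list of per-block binary masks for one stage of a network, that stretching replicates blocks, and that squeezing drops them. The names `stretch_ticket`, `squeeze_ticket`, and `sparsity`, and the rule for which blocks are replicated or dropped, are hypothetical choices for illustration and not necessarily the authors' procedure.

```python
from copy import deepcopy


def stretch_ticket(block_masks, target_depth):
    """Replicate per-block masks until the stage has target_depth blocks.

    Assumed strategy: cycle through the source blocks in order and append copies.
    """
    if target_depth < len(block_masks):
        raise ValueError("target_depth must be >= the source depth")
    stretched = deepcopy(block_masks)
    i = 0
    while len(stretched) < target_depth:
        stretched.append(deepcopy(block_masks[i % len(block_masks)]))
        i += 1
    return stretched


def squeeze_ticket(block_masks, target_depth):
    """Keep only the first target_depth blocks (assumed strategy: drop trailing blocks)."""
    if not 1 <= target_depth <= len(block_masks):
        raise ValueError("target_depth must be between 1 and the source depth")
    return deepcopy(block_masks[:target_depth])


def sparsity(block_masks):
    """Fraction of mask entries that are zero, i.e. pruned weights."""
    total = sum(len(row) for mask in block_masks for row in mask)
    zeros = sum(row.count(0) for mask in block_masks for row in mask)
    return zeros / total


# Toy source ticket: three blocks, each with a 2x2 binary mask.
source_ticket = [
    [[1, 0], [0, 1]],
    [[1, 1], [0, 0]],
    [[0, 0], [0, 1]],
]

deeper = stretch_ticket(source_ticket, 5)     # 3 blocks -> 5 blocks
shallower = squeeze_ticket(source_ticket, 2)  # 3 blocks -> 2 blocks
print(len(deeper), len(shallower), sparsity(deeper), sparsity(shallower))
```

In this sketch the target network receives a mask without any pruning of its own; whether the result behaves like a winning ticket is the empirical question the paper studies.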
Abstract
Lottery Ticket Hypothesis (LTH) raises keen attention to identifying sparse trainable subnetworks, or winning tickets, which can be trained in isolation to achieve similar or even better performance compared to the full models. Despite many efforts being made, the most effective method to identify such winning tickets is still Iterative Magnitude-based Pruning (IMP), which is computationally expensive and has to be run thoroughly for every different network. A natural question that comes in is: can we "transform" the winning ticket found in one network to another with a different architecture, yielding a winning ticket for the latter at the beginning, without re-doing the expensive IMP? Answering this question is not only practically relevant for efficient "once-for-all" winning ticket finding, but also theoretically appealing for uncovering inherently scalable sparse patterns in…
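To make the cost of IMP mentioned in the abstract concrete, here is a minimal sketch of the usual train, prune, rewind loop on a flat list of weights. The function names, the `train_fn` callback, and the toy stand-in for training are assumptions made for illustration; a real run would train a full network in every round, which is where the expense comes from.

```python
def magnitude_prune(weights, mask, prune_fraction):
    """Zero the mask for the smallest-magnitude fraction of surviving weights."""
    alive = [i for i, m in enumerate(mask) if m == 1]
    n_prune = int(len(alive) * prune_fraction)
    to_prune = sorted(alive, key=lambda i: abs(weights[i]))[:n_prune]
    new_mask = list(mask)
    for i in to_prune:
        new_mask[i] = 0
    return new_mask


def iterative_magnitude_pruning(init_weights, train_fn, rounds, prune_fraction):
    """Return a binary mask after `rounds` of train -> prune -> rewind."""
    mask = [1] * len(init_weights)
    for _ in range(rounds):
        # Rewind: every round restarts from the original initialization, masked.
        masked_init = [w * m for w, m in zip(init_weights, mask)]
        # One full training run per round (the expensive step in practice).
        trained = train_fn(masked_init, mask)
        mask = magnitude_prune(trained, mask, prune_fraction)
    return mask


def toy_train(weights, mask):
    """Hypothetical stand-in for training: scales weights, keeps pruned ones at zero."""
    return [2.0 * w * m for w, m in zip(weights, mask)]


init = [0.5, -0.1, 0.9, 0.05, -0.7, 0.3, -0.2, 0.8, 0.01, -0.6]
ticket_mask = iterative_magnitude_pruning(init, toy_train, rounds=2, prune_fraction=0.2)
print(ticket_mask)
```

Because the loop calls `train_fn` once per round and must be repeated for each new architecture, reusing a mask found on one network for another would avoid repeating all of these training runs.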
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
Methods: Pruning
