Instant Soup: Cheap Pruning Ensembles in A Single Pass Can Draw Lottery Tickets from Large Models
Ajay Jaiswal, Shiwei Liu, Tianlong Chen, Ying Ding, Zhangyang Wang

TL;DR
This paper introduces Instant Soup Pruning, a fast and efficient method to generate high-quality subnetworks from large pre-trained models by aggregating multiple weakly trained masks, reducing pruning costs significantly.
Contribution
It proposes a novel pruning approach that replaces iterative magnitude pruning (IMP) with a quick mask-aggregation technique inspired by model soups, enabling efficient extraction of lottery ticket subnetworks.
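The mask-aggregation idea can be illustrated with a toy sketch. This is not the paper's algorithm: the summary does not say how masks are produced or combined, so the sketch assumes each "weak" mask comes from magnitude pruning of slightly perturbed weights (random noise stands in for a few cheap training steps) and that masks are averaged and then re-thresholded to the target sparsity. The names `top_k_mask`, `weak_finetune`, and `soup_mask` are hypothetical.

```python
import random


def top_k_mask(scores, sparsity):
    """Return a 0/1 mask that keeps the highest-scoring (1 - sparsity) fraction."""
    n_keep = len(scores) - round(sparsity * len(scores))
    # Sort indices by descending score; ties are broken by index for determinism.
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    keep = set(order[:n_keep])
    return [1 if i in keep else 0 for i in range(len(scores))]


def weak_finetune(weights, rng, noise=0.05):
    """Assumption: Gaussian noise stands in for a few cheap training steps."""
    return [w + rng.gauss(0.0, noise) for w in weights]


def soup_mask(weights, sparsity, n_masks=8, seed=0):
    """Average several weak magnitude masks, then re-threshold to the target sparsity."""
    rng = random.Random(seed)
    votes = [0] * len(weights)
    for _ in range(n_masks):
        perturbed = weak_finetune(weights, rng)
        mask = top_k_mask([abs(w) for w in perturbed], sparsity)
        votes = [v + m for v, m in zip(votes, mask)]
    averaged = [v / n_masks for v in votes]  # fraction of masks that kept each weight
    return top_k_mask(averaged, sparsity)


if __name__ == "__main__":
    toy_weights = [0.9, -0.8, 0.7, -0.6, 0.5, 0.01, -0.02, 0.03, -0.04, 0.05]
    print(soup_mask(toy_weights, sparsity=0.5))
```

Re-thresholding the averaged mask keeps the final sparsity exact regardless of how much the individual masks disagree; a real implementation would replace the noise step with actual short training runs.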
Findings
Instant Soup Pruning (ISP) achieves performance comparable to traditional IMP-based pruning.
It significantly reduces computational costs in large model pruning.
Validated on CLIP and BERT across multiple datasets.
Abstract
Large pre-trained transformers have received explosive attention in the past few years because of their wide adaptability to numerous downstream applications via fine-tuning, but their exponentially increasing parameter counts are becoming a primary hurdle to even fine-tuning them without industry-standard hardware. Recently, the Lottery Ticket Hypothesis (LTH) and its variants have been exploited to prune these large pre-trained models, generating subnetworks that can achieve performance similar to their dense counterparts, but the practicality of LTH is severely limited by the repetitive full-training-and-pruning routine of iterative magnitude pruning (IMP), which worsens with increasing model size. Motivated by recent observations on model soups, which suggest that the fine-tuned weights of multiple models can be merged into a better minimum, we propose Instant Soup Pruning (ISP) to generate…
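To give a rough sense of why the repeated train-and-prune routine is costly, the sketch below counts how many IMP rounds are needed to reach a target sparsity. It assumes each round removes a fixed fraction of the remaining weights; the 20% default is a commonly used setting, not a value taken from this paper, and `imp_rounds` is a hypothetical helper.

```python
def imp_rounds(target_sparsity, prune_fraction=0.2):
    """Count train-then-prune rounds needed to reach target_sparsity.

    Assumes every round removes `prune_fraction` of the weights that remain.
    """
    if not 0.0 <= target_sparsity < 1.0:
        raise ValueError("target_sparsity must be in [0, 1)")
    if not 0.0 < prune_fraction < 1.0:
        raise ValueError("prune_fraction must be in (0, 1)")
    remaining = 1.0  # fraction of weights still unpruned
    rounds = 0
    while remaining > 1.0 - target_sparsity:
        remaining *= 1.0 - prune_fraction
        rounds += 1
    return rounds


if __name__ == "__main__":
    for sparsity in (0.5, 0.9):
        print(sparsity, imp_rounds(sparsity))
```

Under this assumption every round implies another full training run, which is the cost a single-pass method aims to avoid.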
Taxonomy
Topics: Music and Audio Processing · Advanced Neural Network Applications · Video Analysis and Summarization
Methods: Attention Is All You Need · Pruning · Linear Layer · Layer Normalization · Multi-Head Attention · Adam · Weight Decay · Softmax · Residual Connection
