Sparse Transfer Learning via Winning Lottery Tickets
Rahul Mehta

TL;DR
This paper extends the Lottery Ticket Hypothesis to transfer learning, demonstrating that highly sparse sub-networks can achieve or surpass the accuracy of dense networks across different datasets.
Contribution
The paper applies lottery ticket pruning to transfer learning tasks and shows that sparse sub-networks retain or improve on the performance of the dense network in several settings.
Findings
Sparse sub-networks with approximately 90-95% of weights removed match, and often exceed, the accuracy of the original dense network.
Pruned networks transfer effectively across datasets like CIFAR-10, SmallNORB, and FashionMNIST.
Sparse transfer learning reduces the number of parameters while maintaining accuracy.
Abstract
The recently proposed Lottery Ticket Hypothesis of Frankle and Carbin (2019) suggests that the performance of over-parameterized deep networks is due to the random initialization seeding the network with a small fraction of favorable weights. These weights retain their dominant status throughout training -- in a very real sense, this sub-network "won the lottery" during initialization. The authors find sub-networks via unstructured magnitude pruning with 85-95% of parameters removed that train to the same accuracy as the original network at a similar speed, which they call winning tickets. In this paper, we extend the Lottery Ticket Hypothesis to a variety of transfer learning tasks. We show that sparse sub-networks with approximately 90-95% of weights removed achieve (and often exceed) the accuracy of the original dense network in several realistic settings. We experimentally validate…
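The unstructured magnitude pruning mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's code: the function names `magnitude_mask` and `masked_sgd_step`, the single-matrix setting, the random stand-in weights, and the learning rate are all assumptions made for illustration. The sketch keeps only the largest-magnitude weights, resets the survivors to their initial values to form a candidate ticket, and shows one gradient step that leaves pruned positions at zero.

```python
import numpy as np

def magnitude_mask(weights, sparsity):
    """Boolean mask that drops the `sparsity` fraction of smallest-magnitude weights."""
    flat = np.abs(weights).ravel()
    k = int(round(sparsity * flat.size))      # number of weights to remove
    mask = np.ones(flat.size, dtype=bool)
    if k >= flat.size:
        mask[:] = False
    elif k > 0:
        # argpartition places the k smallest magnitudes in the first k slots
        prune_idx = np.argpartition(flat, k - 1)[:k]
        mask[prune_idx] = False
    return mask.reshape(weights.shape)

def masked_sgd_step(weights, grad, mask, lr=0.1):
    """One SGD step that keeps pruned positions at exactly zero (hypothetical helper)."""
    return (weights - lr * grad) * mask

rng = np.random.default_rng(0)
w_init = rng.normal(size=(20, 10))                              # stand-in for the random initialization
w_trained = w_init + rng.normal(scale=0.5, size=w_init.shape)   # stand-in for trained weights

mask = magnitude_mask(w_trained, sparsity=0.9)   # prune 90% by trained magnitude
ticket = w_init * mask                           # surviving weights reset to their initial values
```

In a transfer setting, one plausible use is to carry `mask` and `ticket` over to a new dataset and train with `masked_sgd_step` so the network stays sparse; whether this matches the paper's exact protocol is not stated in the text above.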
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis
Methods: Pruning
