Sparse Training from Random Initialization: Aligning Lottery Ticket Masks using Weight Symmetry
Mohammed Adnan, Rohan Jain, Ekansh Sharma, Rahul G. Krishnan, Yani Ioannou

TL;DR
This paper proposes a method to improve sparse training from random initialization by aligning lottery ticket masks with the current optimization basin through permutation, leading to better generalization across datasets and models.
Contribution
It introduces a permutation-based alignment technique for LTH masks that improves sparse training from a new random initialization, addressing the failure of LTH masks to generalize beyond the initialization they were found with.
Findings
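Permuting the LTH mask to match the new optimization basin significantly increases generalization
The improvement holds across multiple datasets and models
Sparse training from a different random initialization improves when the mask is permuted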
Abstract
The Lottery Ticket Hypothesis (LTH) suggests there exists a sparse LTH mask and weights that achieve the same generalization performance as the dense model while using significantly fewer parameters. However, finding an LTH solution is computationally expensive, and an LTH sparsity mask does not generalize to other random weight initializations. Recent work has suggested that neural networks trained from random initialization find solutions within the same basin modulo permutation, and proposes a method to align trained models within the same loss basin. We hypothesize that misalignment of basins is the reason why LTH masks do not generalize to new random initializations and propose permuting the LTH mask to align with the new optimization basin when performing sparse training from a different random init. We empirically show a significant increase in generalization when sparse training…
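The weight symmetry the abstract relies on can be illustrated with a toy example. The sketch below is not the authors' alignment procedure: it uses a hypothetical two-layer ReLU MLP without biases, random weights and masks, and a hand-picked permutation `perm` (how the paper actually obtains the permutation is not covered in this summary). It only shows that relabeling hidden units, with the mask permuted consistently alongside the weights, leaves the sparse network's function and its sparsity level unchanged.

```python
import random

def permute_rows(matrix, perm):
    """Reorder rows so that new row i is old row perm[i]."""
    return [matrix[p] for p in perm]

def permute_cols(matrix, perm):
    """Reorder columns so that new column j is old column perm[j]."""
    return [[row[p] for p in perm] for row in matrix]

def permute_mask(mask1, mask2, perm):
    """Apply one hidden-unit permutation to both masks of a 2-layer MLP.

    mask1 has shape (hidden, inputs); mask2 has shape (outputs, hidden).
    Rows of mask1 and columns of mask2 index the same hidden units,
    so both are reordered with the same permutation.
    """
    return permute_rows(mask1, perm), permute_cols(mask2, perm)

def apply_mask(weights, mask):
    """Zero out the weights whose mask entry is 0."""
    return [[w * m for w, m in zip(w_row, m_row)]
            for w_row, m_row in zip(weights, mask)]

def forward(w1, w2, x):
    """Two-layer ReLU MLP without biases: w2 @ relu(w1 @ x)."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

# Hypothetical toy sizes and random values, chosen only for illustration.
rng = random.Random(0)
n_in, n_hidden, n_out = 3, 4, 2
w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
w2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_out)]
mask1 = [[rng.randint(0, 1) for _ in range(n_in)] for _ in range(n_hidden)]
mask2 = [[rng.randint(0, 1) for _ in range(n_hidden)] for _ in range(n_out)]
perm = [2, 0, 3, 1]  # assumed permutation of the 4 hidden units
x = [0.5, -1.0, 2.0]

# Sparse network with the original labeling of hidden units.
y_ref = forward(apply_mask(w1, mask1), apply_mask(w2, mask2), x)

# Same network with hidden units relabeled: weights and masks move together.
w1_p, w2_p = permute_rows(w1, perm), permute_cols(w2, perm)
mask1_p, mask2_p = permute_mask(mask1, mask2, perm)
y_perm = forward(apply_mask(w1_p, mask1_p), apply_mask(w2_p, mask2_p), x)

# Compare with a tolerance because the summation order changes.
same_function = all(abs(a - b) < 1e-12 for a, b in zip(y_ref, y_perm))
```

Under these assumptions, a mask found for one labeling of hidden units only lines up with the weights if it is permuted the same way. That is the intuition behind applying a permuted, rather than the original, LTH mask to a different random initialization.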
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis
Methods: ALIGN
