Sparse Training from Random Initialization: Aligning Lottery Ticket Masks using Weight Symmetry

Mohammed Adnan; Rohan Jain; Ekansh Sharma; Rahul G. Krishnan; Yani Ioannou

arXiv:2505.05143·cs.LG·August 18, 2025

TL;DR

This paper improves sparse training from random initialization by permuting a Lottery Ticket Hypothesis (LTH) mask so that it aligns with the optimization basin of the new initialization. The authors report a significant increase in generalization across multiple datasets and models.

Contribution

The paper introduces a permutation-based technique for aligning LTH masks with a new random initialization, addressing the failure of LTH masks to generalize across initializations.

Findings

01 Significant increase in generalization with permuted masks
02 Effective across multiple datasets and models
03 Improved sparse training performance

Abstract

The Lottery Ticket Hypothesis (LTH) suggests there exists a sparse LTH mask and weights that achieve the same generalization performance as the dense model while using significantly fewer parameters. However, finding an LTH solution is computationally expensive, and an LTH sparsity mask does not generalize to other random weight initializations. Recent work has suggested that neural networks trained from random initialization find solutions within the same basin modulo permutation, and proposes a method to align trained models within the same loss basin. We hypothesize that misalignment of basins is the reason why LTH masks do not generalize to new random initializations and propose permuting the LTH mask to align with the new optimization basin when performing sparse training from a different random init. We empirically show a significant increase in generalization when sparse training…
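
The weight symmetry the abstract relies on can be illustrated with a minimal sketch, not the authors' implementation. It assumes a toy two-layer ReLU MLP stored as nested lists. The helper names (`permute_hidden_units`, `apply_mask`, `forward`) and the hand-picked permutation `perm` are hypothetical. How a suitable permutation is actually found by matching models is outside the scope of this sketch.

```python
import math
import random


def permute_hidden_units(first, second, perm):
    """Reorder the hidden units of a two-layer MLP (hypothetical helper).

    first:  hidden x input matrix (one row per hidden unit)
    second: output x hidden matrix (one column per hidden unit)
    perm:   new hidden unit i takes the place of old hidden unit perm[i]
    """
    new_first = [list(first[p]) for p in perm]
    new_second = [[row[p] for p in perm] for row in second]
    return new_first, new_second


def apply_mask(weights, mask):
    """Zero out the weights whose mask entry is 0."""
    return [[w * m for w, m in zip(w_row, m_row)]
            for w_row, m_row in zip(weights, mask)]


def forward(w1, w2, x):
    """Two-layer MLP with a ReLU between the layers, no biases."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


random.seed(0)
n_in, n_hidden, n_out = 3, 4, 2
w1 = [[random.gauss(0, 1) for _ in range(n_in)] for _ in range(n_hidden)]
w2 = [[random.gauss(0, 1) for _ in range(n_hidden)] for _ in range(n_out)]
mask1 = [[random.randint(0, 1) for _ in range(n_in)] for _ in range(n_hidden)]
mask2 = [[random.randint(0, 1) for _ in range(n_hidden)] for _ in range(n_out)]

# Assumption: a hand-picked permutation stands in for one found by alignment.
perm = [2, 0, 3, 1]
w1_p, w2_p = permute_hidden_units(w1, w2, perm)
mask1_p, mask2_p = permute_hidden_units(mask1, mask2, perm)

x = [0.5, -1.0, 2.0]
out_original = forward(apply_mask(w1, mask1), apply_mask(w2, mask2), x)
out_permuted = forward(apply_mask(w1_p, mask1_p), apply_mask(w2_p, mask2_p), x)

# Permuting weights and mask together leaves the sparse network's function
# unchanged, and the permuted mask keeps the same number of nonzero entries.
same_function = all(math.isclose(a, b, abs_tol=1e-9)
                    for a, b in zip(out_original, out_permuted))
same_sparsity = sum(map(sum, mask1)) == sum(map(sum, mask1_p))
```

Here the weights are permuted together with the mask only to check the symmetry. In the setting the abstract describes, the permuted mask alone would be applied to a different random initialization, with the permutation chosen so that the mask lines up with that initialization's basin.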

Code & Models

Repositories

calgaryml/sparse-rebasin · PyTorch · Official

Videos

Sparse Training from Random Initialization: Aligning Lottery Ticket Masks using Weight Symmetry · SlidesLive

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis

Methods: ALIGN