Winning the Lottery by Preserving Network Training Dynamics with Concrete Ticket Search
Tanay Arora, Christof Teuscher

TL;DR
This paper introduces Concrete Ticket Search (CTS), an efficient method for finding high-quality sparse subnetworks in neural networks. CTS outperforms existing techniques such as Lottery Ticket Rewinding (LTR) and Pruning-at-Initialization (PaI), especially at high sparsity levels.
Contribution
The paper proposes CTS, which treats subnetwork discovery as a combinatorial optimization problem and solves it with a Concrete relaxation and gradient balancing, without extensive hyperparameter tuning.
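As a rough illustration of what a Concrete relaxation of a binary pruning mask can look like, the sketch below samples a soft mask from the standard binary Concrete (relaxed Bernoulli) distribution. It is not the authors' implementation: the function name `sample_concrete_mask`, the per-weight `logits`, the `temperature` value, and the thresholding rule for the hard mask are all assumptions made for this example, and the gradient balancing component is not shown.

```python
import numpy as np

def sample_concrete_mask(logits, temperature, rng):
    """Draw a soft mask in (0, 1) from a binary Concrete distribution.

    logits: hypothetical per-weight "keep" scores (log-odds of keeping a weight).
    temperature: smaller values push samples closer to 0 or 1.
    """
    # Uniform noise, kept away from 0 and 1 so the logs stay finite.
    u = rng.uniform(low=1e-6, high=1.0 - 1e-6, size=logits.shape)
    # Logistic noise obtained from the uniform sample.
    logistic_noise = np.log(u) - np.log1p(-u)
    # Tempered sigmoid: a differentiable stand-in for a Bernoulli draw.
    return 1.0 / (1.0 + np.exp(-(logits + logistic_noise) / temperature))

rng = np.random.default_rng(0)
logits = np.array([-4.0, 0.0, 4.0])  # assumed values, for illustration only
soft_mask = sample_concrete_mask(logits, temperature=0.5, rng=rng)

# One simple way (an assumption here) to read off a discrete mask afterwards:
# keep a weight when its logit is positive.
hard_mask = (logits > 0).astype(np.float64)
```

Because the soft mask is a smooth function of the logits, the logits can be optimized with ordinary gradient descent, which is the usual motivation for relaxing a discrete search space this way.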
Findings
CTS achieves high sparsity with competitive accuracy.
CTS outperforms saliency-based pruning methods at all sparsity levels.
CTS significantly reduces computation time compared to LTR.
Abstract
The Lottery Ticket Hypothesis asserts the existence of highly sparse, trainable subnetworks ('winning tickets') within dense, randomly initialized neural networks. However, state-of-the-art methods of drawing these tickets, like Lottery Ticket Rewinding (LTR), are computationally prohibitive, while more efficient saliency-based Pruning-at-Initialization (PaI) techniques suffer from a significant accuracy-sparsity trade-off and fail basic sanity checks. In this work, we argue that PaI's reliance on first-order saliency metrics, which ignore inter-weight dependencies, contributes substantially to this performance gap, especially in the sparse regime. To address this, we introduce Concrete Ticket Search (CTS), an algorithm that frames subnetwork discovery as a holistic combinatorial optimization problem. By leveraging a Concrete relaxation of the discrete search space and a novel gradient…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neural Network Applications · Multimodal Machine Learning Applications
