Strong Lottery Ticket Hypothesis with $\varepsilon$-perturbation

Zheyang Xiong; Fangshuo Liao; Anastasios Kyrillidis

arXiv:2210.16589 · cs.LG · November 1, 2022

TL;DR

This paper extends the strong Lottery Ticket Hypothesis by incorporating $\varepsilon$-perturbations, reducing over-parameterization needs and demonstrating that SGD-perturbed weights improve pruning performance.

Contribution

It generalizes the theoretical framework of the strong LTH to include perturbations, lowering the over-parameterization threshold and linking SGD weight changes to effective perturbations.

Findings

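01. Allowing an $\varepsilon$-perturbation of the initial weights reduces the over-parameterization required of the candidate network.

02. SGD-perturbed weights outperform unperturbed weights in pruning tasks.

03. The theoretical result on subset sum is extended to allow perturbed candidates and applied to neural network weights.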

Abstract

The strong Lottery Ticket Hypothesis (LTH) claims that a sufficiently large, randomly initialized neural network contains a subnetwork that approximates a given target network without any training. We extend the theoretical guarantees of the strong LTH literature to a setting closer to the original LTH by generalizing the weight change in the pre-training step to a perturbation around initialization. In particular, we focus on the following open questions: By allowing an $\varepsilon$-scale perturbation of the random initial weights, can we reduce the over-parameterization requirement for the candidate network in the strong LTH? Furthermore, does the weight change induced by SGD coincide with a good set of such perturbations? We answer the first question by first extending the theoretical result on subset sum to allow perturbation of the candidates. Applying this…
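The subset-sum question raised in the abstract can be illustrated with a small brute-force sketch. This is a toy illustration, not the paper's construction or proof: it assumes candidates drawn uniformly from [-1, 1], and it assumes each selected candidate may be shifted by at most `eps` in absolute value, so a subset of size k can reach any value within k * eps of its unperturbed sum. The function names `best_subset_error` and `smallest_pool` are hypothetical.

```python
import itertools
import random


def best_subset_error(candidates, target, eps=0.0):
    """Smallest achievable |target - subset sum| over all subsets.

    Assumption: every selected candidate may be perturbed by at most eps,
    so a subset of size k covers the interval [sum - k*eps, sum + k*eps].
    """
    best = abs(target)  # error of the empty subset
    for mask in itertools.product((0, 1), repeat=len(candidates)):
        size = sum(mask)
        total = sum(c for c, keep in zip(candidates, mask) if keep)
        # Residual error after using the full perturbation budget of the subset.
        err = max(0.0, abs(target - total) - size * eps)
        best = min(best, err)
    return best


def smallest_pool(target, tol, eps, seed=0, max_n=14):
    """Number of random candidates needed to hit `target` within `tol`.

    Candidates are added one at a time from a seeded generator, so runs with
    different eps see the same candidate sequence. Returns None if max_n
    candidates are not enough.
    """
    rng = random.Random(seed)
    candidates = []
    for n in range(1, max_n + 1):
        candidates.append(rng.uniform(-1.0, 1.0))
        if best_subset_error(candidates, target, eps) <= tol:
            return n
    return None


if __name__ == "__main__":
    for eps in (0.0, 0.005, 0.02):
        print("eps =", eps, "-> candidates needed:", smallest_pool(0.37, 1e-2, eps))
```

Because the candidate sequence is fixed by the seed and the error can only shrink as `eps` grows, the pool size reported for a larger `eps` is never larger than for a smaller one. The brute-force search is exponential in the number of candidates, which is why `max_n` is kept small.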

Taxonomy

Topics: Advanced Neural Network Applications · Neural Networks and Applications · Machine Learning and ELM

Methods: Stochastic Gradient Descent