Universality of Winning Tickets: A Renormalization Group Perspective
William T. Redman, Tianlong Chen, Zhangyang Wang, Akshunna S. Dogra

TL;DR
This paper uses renormalization group theory to study the universality of winning tickets in neural networks. It treats iterative magnitude pruning as a renormalization group scheme whose flow in parameter space, and proximity to fixed points, help account for the transfer of winning tickets across tasks and architectures.
Contribution
It brings the renormalization group, a framework from theoretical physics, to the study of winning-ticket universality and transferability, showing that iterative magnitude pruning is a renormalization group scheme that induces a flow in parameter space.
Findings
ResNet-50 models with transferable winning tickets have flows with common properties, as renormalization group theory would predict.
Flows of BERT models show similar behavior, with evidence that they lie near fixed points.
Smaller models have more uniform flow properties, affecting transferability.
Abstract
Foundational work on the Lottery Ticket Hypothesis has suggested an exciting corollary: winning tickets found in the context of one task can be transferred to similar tasks, possibly even across different architectures. This has generated broad interest, but methods to study this universality are lacking. We make use of renormalization group theory, a powerful tool from theoretical physics, to address this need. We find that iterative magnitude pruning, the principal algorithm used for discovering winning tickets, is a renormalization group scheme, and can be viewed as inducing a flow in parameter space. We demonstrate that ResNet-50 models with transferable winning tickets have flows with common properties, as would be expected from the theory. Similar observations are made for BERT models, with evidence that their flows are near fixed points. Additionally, we leverage our framework to…
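As a rough illustration of the algorithm the abstract refers to, here is a minimal NumPy sketch of iterative magnitude pruning. It is not the authors' implementation. The function names, the `train_fn` callback, and the per-round pruning fraction are all assumptions made for this example. Each round trains the masked network, removes the smallest-magnitude surviving weights, and resets the survivors to their initial values, as in the original Lottery Ticket procedure (some variants rewind to an early training checkpoint instead).

```python
import numpy as np


def prune_smallest(weights, mask, fraction):
    """Return a new mask with the smallest-magnitude surviving weights removed.

    `fraction` is the share of currently surviving weights to drop (assumed name).
    """
    surviving = np.flatnonzero(mask)
    n_prune = int(fraction * surviving.size)
    if n_prune == 0:
        return mask.copy()
    magnitudes = np.abs(weights.ravel()[surviving])
    # Indices (into the flattened array) of the weakest surviving weights.
    drop = surviving[np.argsort(magnitudes)[:n_prune]]
    new_mask = mask.copy().ravel()
    new_mask[drop] = False
    return new_mask.reshape(mask.shape)


def iterative_magnitude_pruning(init_weights, train_fn, rounds, fraction):
    """Hypothetical IMP loop: train, prune, reset to the initial weights, repeat.

    `train_fn(weights, mask)` is a stand-in for a real training routine and
    must return trained weights with the same shape as `weights`.
    """
    mask = np.ones(init_weights.shape, dtype=bool)
    sparsities = []
    for _ in range(rounds):
        # Training always restarts from the initial weights under the current mask.
        trained = train_fn(init_weights * mask, mask)
        mask = prune_smallest(trained, mask, fraction)
        sparsities.append(1.0 - mask.mean())
    return mask, sparsities


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w0 = rng.normal(size=(4, 4))
    # Toy stand-in for training: leaves the weights unchanged.
    final_mask, sparsity_per_round = iterative_magnitude_pruning(
        w0, lambda w, m: w, rounds=3, fraction=0.25
    )
    print(final_mask.astype(int))
    print(sparsity_per_round)
```

The sequence of masks produced across rounds is the object the paper interprets as a flow in parameter space. The sketch only shows the pruning mechanics and does not attempt the renormalization group analysis.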
Taxonomy
Topics: Artificial Intelligence in Games · Sports Analytics and Performance · Gambling Behavior and Treatments
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · 1x1 Convolution · Average Pooling · Batch Normalization · Residual Connection · WordPiece · Bottleneck Residual Block · Dense Connections
