When Layers Play the Lottery, all Tickets Win at Initialization

Artur Jordao, George Correa de Araujo, Helena de Almeida Maia, and Helio Pedrini

arXiv:2301.10835 · cs.LG · March 20, 2024 · 1 citation

Open Access · 1 Repo

TL;DR

This paper studies layer pruning in neural networks and shows that sparse subnetworks (winning tickets) can be identified at initialization, without first training the dense network. The resulting subnetworks train faster, lower the carbon emissions of training, and are more robust.

Contribution

The paper introduces a layer-pruning approach that finds winning tickets at initialization, reducing training cost and improving robustness compared with traditional filter-pruning methods.

Findings

01. Winning tickets exist when layers are pruned.

02. Layer-based winning tickets speed up training and reduce carbon emissions.

03. Subnetworks from layer pruning are more robust against adversarial and out-of-distribution data.

Abstract

Pruning is a standard technique for reducing the computational cost of deep networks. Many advances in pruning leverage concepts from the Lottery Ticket Hypothesis (LTH). LTH reveals that inside a trained dense network there exist sparse subnetworks (tickets) able to achieve similar accuracy (i.e., to win the lottery, hence "winning tickets"). Pruning at initialization focuses on finding winning tickets without training a dense network. Studies of these concepts almost always obtain subnetworks through weight or filter pruning. In this work, we investigate LTH and pruning at initialization through the lens of layer pruning. First, we confirm the existence of winning tickets when the pruning process removes layers. Building on this observation, we propose to discover these winning tickets at initialization, eliminating the requirement of heavy computational resources for training the initial…
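To make the idea of layer pruning at initialization concrete, here is a minimal sketch, not the authors' method. It builds an untrained stack of residual blocks, scores each block, and keeps only the highest-scoring ones. The scoring rule (how much a block changes a set of random probe inputs) and all names (`make_block`, `block_score`, `prune_layers`) are hypothetical placeholders chosen for illustration; the abstract does not state which criterion the paper uses. Each block is also scored on the raw probes rather than on activations at its own depth, which is a further simplification.

```python
import random

def make_block(width, rng):
    """One residual block x -> x + W x, with small random weights W (width x width)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(width)] for _ in range(width)]

def apply_block(weights, x):
    # The residual form keeps the output the same size as the input,
    # so a whole block can be removed without breaking shapes.
    return [xi + sum(w * xj for w, xj in zip(row, x)) for xi, row in zip(x, weights)]

def forward(blocks, x):
    for weights in blocks:
        x = apply_block(weights, x)
    return x

def block_score(weights, probes):
    """Hypothetical importance: mean squared change the block makes to probe inputs."""
    total = 0.0
    for x in probes:
        y = apply_block(weights, x)
        total += sum((yi - xi) ** 2 for yi, xi in zip(y, x))
    return total / len(probes)

def prune_layers(blocks, probes, keep):
    """Keep the `keep` highest-scoring blocks, preserving their original order."""
    ranked = sorted(range(len(blocks)),
                    key=lambda i: block_score(blocks[i], probes),
                    reverse=True)
    kept = sorted(ranked[:keep])
    return [blocks[i] for i in kept], kept

def count_params(blocks):
    return sum(len(row) for weights in blocks for row in weights)

rng = random.Random(0)
width, depth = 8, 6
dense = [make_block(width, rng) for _ in range(depth)]  # untrained network
probes = [[rng.gauss(0.0, 1.0) for _ in range(width)] for _ in range(16)]

# Select the layer-pruned subnetwork before any training happens.
ticket, kept_indices = prune_layers(dense, probes, keep=4)
output = forward(ticket, probes[0])
print(kept_indices, count_params(dense), count_params(ticket), len(output))
```

The subnetwork reuses the dense network's initial weights for the blocks it keeps, so only the shallower model would then need to be trained. Whether such a subnetwork matches the dense network's accuracy is the empirical question the paper addresses; this sketch shows only the selection step.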

Code & Models

Repositories

arturjordao/layerlottery (Official)

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning

Methods: Pruning · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings