Projection-Free CNN Pruning via Frank-Wolfe with Momentum: Sparser Models with Less Pretraining
Hamza ElMokhtar Shili, Natasha Patnaik, Isabelle Ruble, Kathryn Jarjoura, Daniel Suarez Aguirre

TL;DR
This paper prunes CNNs with a Frank-Wolfe algorithm with momentum. On a CNN trained on MNIST, the method yields sparser and more accurate models after only a few epochs of dense pretraining, which reduces training time and computational cost.
Contribution
The paper introduces a Frank-Wolfe-based pruning scheme with momentum for CNNs and demonstrates improved sparsity and accuracy with less pretraining than traditional methods.
Findings
Frank-Wolfe (FW) with momentum produces sparser, more accurate CNNs.
Effective after only a few pretraining epochs.
Minimal inference overhead in pruned models.
Abstract
We investigate algorithmic variants of the Frank-Wolfe (FW) optimization method for pruning convolutional neural networks. This is motivated by the "Lottery Ticket Hypothesis", which suggests that larger pre-trained networks contain smaller sub-networks that perform comparably well, if not better. Whilst most literature in this area focuses on deep neural networks more generally, we specifically consider convolutional neural networks (CNNs) for image classification tasks. Building on the hypothesis, we compare simple magnitude-based pruning, a Frank-Wolfe style pruning scheme, and an FW method with momentum on a CNN trained on MNIST. Our experiments track test accuracy, loss, sparsity, and inference time as we vary the dense pre-training budget from 1 to 10 epochs. We find that FW with momentum yields pruned networks that are both sparser and more accurate than the original…
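To make the comparison in the abstract concrete, here is a minimal NumPy sketch of two of the ingredients it names: magnitude-based pruning and a Frank-Wolfe update with momentum. The abstract does not state the paper's constraint set, step size, or momentum rule, so everything below is an assumption chosen for illustration: the feasible region is a k-sparse polytope (an intersection of an L-infinity ball of radius `tau` and an L1 ball of radius `k * tau`), the step size is the classic `2 / (t + 2)`, and the momentum is an exponential moving average of gradients with weight `rho`. The function names `magnitude_prune`, `lmo_k_sparse`, and `fw_momentum` are hypothetical and are not taken from the paper.

```python
import numpy as np


def magnitude_prune(w, sparsity):
    """Zero out the `sparsity` fraction of entries with the smallest magnitude."""
    w = np.asarray(w, dtype=float)
    n_prune = int(np.floor(sparsity * w.size))
    out = w.copy().ravel()
    if n_prune > 0:
        # Indices of the n_prune smallest |w| values.
        idx = np.argsort(np.abs(out), kind="stable")[:n_prune]
        out[idx] = 0.0
    return out.reshape(w.shape)


def lmo_k_sparse(grad, k, tau):
    """Linear minimization oracle over the (assumed) k-sparse polytope.

    Returns the vertex s minimizing <grad, s> subject to
    ||s||_inf <= tau and ||s||_1 <= k * tau: it puts -tau * sign(grad)
    on the k largest-magnitude gradient entries and zero elsewhere.
    """
    grad = np.asarray(grad, dtype=float)
    s = np.zeros_like(grad)
    top = np.argsort(-np.abs(grad), kind="stable")[:k]
    s[top] = -tau * np.sign(grad[top])
    return s


def fw_momentum(grad_fn, w0, k, tau, steps=200, rho=0.1):
    """Projection-free Frank-Wolfe iterations with gradient momentum.

    Assumed update (not confirmed by the source):
        m <- (1 - rho) * m + rho * grad
        s <- LMO(m)
        w <- w + gamma * (s - w),  gamma = 2 / (t + 2)
    """
    w = np.asarray(w0, dtype=float).copy()
    m = np.zeros_like(w)
    for t in range(steps):
        m = (1.0 - rho) * m + rho * grad_fn(w)
        s = lmo_k_sparse(m, k, tau)
        gamma = 2.0 / (t + 2.0)
        # Convex combination of w and a vertex, so no projection is needed.
        w = w + gamma * (s - w)
    return w


# Toy usage on a least-squares objective standing in for a network loss.
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 20))
b = A @ magnitude_prune(rng.normal(size=20), sparsity=0.8)
w_fw = fw_momentum(lambda w: A.T @ (A @ w - b), np.zeros(20), k=4, tau=3.0)
```

Because each update is a convex combination of the current iterate and a k-sparse vertex, the iterates stay inside the constraint set without any projection step, which is the sense in which the method is projection-free. How the paper turns the resulting weights into a final pruning mask is not stated in the abstract, so that step is left out here.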
Taxonomy
Topics: Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques · Generative Adversarial Networks and Image Synthesis
