Compression-aware Training of Neural Networks using Frank-Wolfe
Max Zimmer, Christoph Spiegel, Sebastian Pokutta

TL;DR
This paper introduces a compression-aware training framework built on the Stochastic Frank-Wolfe (SFW) algorithm and norm constraints. It produces dense neural networks that remain robust to convolutional filter pruning and low-rank matrix decomposition without retraining.
Contribution
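It presents a versatile framework that combines a family of norm constraints with SFW for compression-robust training. The framework outperforms existing compression-aware methods and, for low-rank matrix decomposition, needs significantly less computation than approaches based on nuclear-norm regularization.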
Findings
Outperforms existing compression-aware methods
Requires significantly less computation for low-rank matrix decomposition than nuclear-norm regularization approaches
Dynamic learning rate adjustment is crucial for convergence
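
To make the low-rank finding concrete, here is a minimal sketch of post-training low-rank matrix decomposition via truncated SVD. It is not the paper's code: the function name `low_rank_factors`, the random 64x32 matrix, and the rank of 8 are hypothetical choices made for illustration only.

```python
import numpy as np

def low_rank_factors(weight, rank):
    """Split a weight matrix into two thin factors via truncated SVD (illustrative helper)."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    left = u[:, :rank] * s[:rank]   # shape (rows, rank), singular values folded in
    right = vt[:rank, :]            # shape (rank, cols)
    return left, right

# Hypothetical dense layer weights; a real network's weights would go here.
rng = np.random.default_rng(0)
weight = rng.standard_normal((64, 32))

left, right = low_rank_factors(weight, rank=8)
approx = left @ right                       # rank-8 approximation of the layer

dense_params = weight.size                  # 64 * 32
factored_params = left.size + right.size    # 64 * 8 + 8 * 32
relative_error = np.linalg.norm(weight - approx) / np.linalg.norm(weight)
```

Storing `left` and `right` instead of `weight` cuts the parameter count whenever the rank is small enough. How much accuracy survives depends on how quickly the singular values decay, which is the property compression-aware training tries to encourage.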
Abstract
Many existing Neural Network pruning approaches rely on either retraining or inducing a strong bias in order to converge to a sparse solution throughout training. A third paradigm, 'compression-aware' training, aims to obtain state-of-the-art dense models that are robust to a wide range of compression ratios using a single dense training run while also avoiding retraining. We propose a framework centered around a versatile family of norm constraints and the Stochastic Frank-Wolfe (SFW) algorithm that encourage convergence to well-performing solutions while inducing robustness towards convolutional filter pruning and low-rank matrix decomposition. Our method is able to outperform existing compression-aware approaches and, in the case of low-rank matrix decomposition, it also requires significantly less computational resources than approaches based on nuclear-norm regularization. Our…
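
As a rough illustration of how a Frank-Wolfe update keeps weights inside a norm constraint, the sketch below runs the method on a toy quadratic objective over an L2-norm ball. Several things here are assumptions rather than the authors' setup: the L2 ball stands in for the paper's norm family, the gradient is exact rather than stochastic, and the names `lmo_l2_ball` and `frank_wolfe_step` as well as the `2 / (t + 2)` step size are illustrative choices.

```python
import numpy as np

def lmo_l2_ball(grad, radius):
    """Linear minimization oracle: argmin over ||v||_2 <= radius of <grad, v>."""
    norm = np.linalg.norm(grad)
    if norm == 0.0:
        return np.zeros_like(grad)
    return -radius * grad / norm

def frank_wolfe_step(w, grad, radius, lr):
    """Move toward the oracle's vertex; a convex combination stays feasible."""
    v = lmo_l2_ball(grad, radius)
    return (1.0 - lr) * w + lr * v

# Toy problem (assumed): minimize 0.5 * ||w - target||^2 subject to ||w||_2 <= 1.
target = np.array([3.0, 4.0])
w = np.zeros(2)
for t in range(200):
    grad = w - target                      # exact gradient of the toy objective
    w = frank_wolfe_step(w, grad, radius=1.0, lr=2.0 / (t + 2))
```

The update never needs a projection step: each iterate is a convex combination of feasible points, so the constraint holds by construction. In a stochastic variant, `grad` would be replaced by a mini-batch gradient estimate.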
Taxonomy
Topics: Advanced Neural Network Applications · Neural Networks and Applications · Machine Learning and Data Classification
Methods: Pruning
