How I Learned to Stop Worrying and Love Retraining
Max Zimmer, Christoph Spiegel, Sebastian Pokutta

TL;DR
This paper shows that a simple linear learning rate schedule can substantially shorten the retraining phase in neural network pruning while outperforming more complex approaches, which challenges the view that retraining should be avoided altogether.
Contribution
It proposes a method to adaptively select the initial value of a linear learning rate schedule for retraining, shows that effective pruning can be achieved within a fixed training budget, and questions established pruning practices.
Findings
A shortened retraining phase with a linear learning rate schedule outperforms more complex methods.
Adaptively selecting the initial value of the linear schedule improves on existing retraining approaches.
Incorporating sparsification into standard training can reduce the need for retraining.
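As a rough illustration of the linear schedule mentioned in the findings above, the following minimal sketch decays the learning rate linearly from an initial value to zero over a fixed retraining budget. The function name `linear_lr` and its parameters are hypothetical and chosen for illustration; this is not the authors' implementation, and how the initial value is selected adaptively is not shown here.

```python
def linear_lr(step, total_steps, initial_lr):
    """Linearly decay the learning rate from initial_lr to 0 over total_steps.

    Hypothetical helper for illustration; the names are assumptions,
    not taken from the paper's code.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp the step so that values outside the budget stay in [0, total_steps].
    clamped = min(max(step, 0), total_steps)
    progress = clamped / total_steps
    return initial_lr * (1.0 - progress)


# Example: a retraining budget of 4 steps starting from a learning rate of 0.1.
schedule = [linear_lr(t, total_steps=4, initial_lr=0.1) for t in range(5)]
```

The schedule starts at the initial value and reaches exactly zero at the end of the budget, so shortening the budget only changes the slope of the decay.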
Abstract
Many Neural Network Pruning approaches consist of several iterative training and pruning steps, seemingly losing a significant amount of their performance after pruning and then recovering it in the subsequent retraining phase. Recent works of Renda et al. (2020) and Le & Hua (2021) demonstrate the significance of the learning rate schedule during the retraining phase and propose specific heuristics for choosing such a schedule for IMP (Han et al., 2015). We place these findings in the context of the results of Li et al. (2020) regarding the training of models within a fixed training budget and demonstrate that, consequently, the retraining phase can be massively shortened using a simple linear learning rate schedule. Improving on existing retraining approaches, we additionally propose a method to adaptively select the initial value of the linear schedule. Going a step further, we…
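For context, the pruning step in magnitude-based approaches such as IMP removes the weights with the smallest absolute values. The sketch below shows that idea on a flat list of weights. It is a simplified illustration under stated assumptions: the function name `magnitude_prune` is hypothetical, real implementations operate on network tensors and keep a mask, and the retraining between pruning steps is omitted.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude.

    Hypothetical helper for illustration; operates on a flat list of floats.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    num_to_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:num_to_prune])
    # Keep surviving weights unchanged and set pruned ones to zero.
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


# Example: prune half of four weights.
pruned_weights = magnitude_prune([0.5, -0.1, 0.05, -2.0], sparsity=0.5)
```

In an iterative scheme, a step like this would alternate with a retraining phase, which is where the choice of learning rate schedule discussed in the abstract comes in.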
Taxonomy
Topics: Neural Networks and Applications · Advanced Neural Network Applications · Model Reduction and Neural Networks
Methods: Pruning
