Boosting Pruned Networks with Linear Over-parameterization

Yu Qian; Jian Cao; Xiaoshuang Li; Jie Zhang; Hufei Li; Jue Chen

arXiv:2204.11444·cs.CV·January 1, 2024

Open Access

TL;DR

This paper introduces a method that improves the accuracy of pruned neural networks by linearly over-parameterizing their layers during fine-tuning and re-parameterizing them back to the original compact layers afterward, with similarity-preserving knowledge distillation guiding the over-parameterized blocks.

Contribution

The paper proposes a novel linear over-parameterization technique, combined with knowledge distillation, that restores more of the accuracy lost when a network is pruned.

Findings

01. Significantly outperforms vanilla fine-tuning on CIFAR-10 and ImageNet.

02. Especially effective at large pruning ratios.

03. Enables more accurate fine-tuning of highly compressed networks.

Abstract

Structured pruning compresses neural networks by removing channels (filters), giving faster inference and a smaller run-time footprint. Fine-tuning is usually applied to the pruned network to restore accuracy. However, the small number of parameters left after pruning makes it hard for fine-tuning to recover that accuracy. To address this challenge, we propose a novel method that first linearly over-parameterizes the compact layers in pruned networks to enlarge the number of fine-tuning parameters and then re-parameterizes them to the original layers after fine-tuning. Specifically, we equivalently expand each convolution/linear layer into several consecutive convolution/linear layers that do not alter the current output feature maps. Furthermore, we utilize similarity-preserving knowledge distillation that encourages the over-parameterized block to learn the immediate…
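The expand-then-merge idea for a linear layer can be illustrated with a small sketch. This is not the paper's implementation: the identity initialization of the second layer, the choice of a hidden width equal to the output width, and the helper names `expand_linear`, `merge_linear`, and `perturb` are all assumptions made for illustration. It relies only on the fact that two consecutive linear layers without a nonlinearity between them compose into one: W2 (W1 x + b1) + b2 = (W2 W1) x + (W2 b1 + b2). Random perturbation stands in for fine-tuning, and the convolutional case is not covered.

```python
import random

def matvec(a, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in a]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def forward(w, b, x):
    """Linear layer: y = W x + b."""
    return [p + q for p, q in zip(matvec(w, x), b)]

def expand_linear(w, b):
    """Split one linear layer into two consecutive ones with the same output.

    Assumed initialization (hypothetical, not taken from the paper):
    the first layer keeps (W, b); the second is an identity with zero bias.
    """
    n = len(w)
    eye = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    return ([row[:] for row in w], b[:]), (eye, [0.0] * n)

def merge_linear(layer1, layer2):
    """Collapse two consecutive linear layers: W = W2 W1, b = W2 b1 + b2."""
    (w1, b1), (w2, b2) = layer1, layer2
    merged_b = [p + q for p, q in zip(matvec(w2, b1), b2)]
    return matmul(w2, w1), merged_b

def perturb(layer, scale=0.1):
    """Stand-in for fine-tuning: nudge every weight and bias at random."""
    w, b = layer
    new_w = [[v + random.uniform(-scale, scale) for v in row] for row in w]
    new_b = [v + random.uniform(-scale, scale) for v in b]
    return new_w, new_b

random.seed(0)
w = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]  # 2 outputs, 3 inputs
b = [random.uniform(-1, 1) for _ in range(2)]
x = [0.5, -1.0, 2.0]

# Expansion leaves the layer's output unchanged.
layer1, layer2 = expand_linear(w, b)
y_ref = forward(w, b, x)
y_expanded = forward(*layer2, forward(*layer1, x))

# After "fine-tuning" both layers, merging gives a single layer of the
# original shape that computes the same function as the two-layer block.
tuned1, tuned2 = perturb(layer1), perturb(layer2)
y_two_layer = forward(*tuned2, forward(*tuned1, x))
merged_w, merged_b = merge_linear(tuned1, tuned2)
y_merged = forward(merged_w, merged_b, x)
```

In this sketch the expanded block trains twice as many weight matrices as the compact layer, yet the merged layer has the original 2×3 shape, so inference cost is unchanged after re-parameterization.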

Taxonomy

Topics: Advanced Neural Network Applications · Medical Imaging and Analysis · Domain Adaptation and Few-Shot Learning

Methods: Pruning · Knowledge Distillation