A Novel Architecture Slimming Method for Network Pruning and Knowledge Distillation

Dongqi Wang; Shengyu Zhang; Zhipeng Di; Xin Lin; Weihua Zhou; Fei Wu

arXiv:2202.10461·cs.CV·February 23, 2022

Open Access

TL;DR

This paper introduces an automated architecture slimming method that optimizes layer configurations for network pruning and knowledge distillation, reducing manual effort and improving performance at similar compression rates.

Contribution

It formulates architecture determination as a PCA-based linear transformation, enabling automatic, effective layer-wise compression without extensive experimentation.

Findings

01 Significant performance gains over baselines at the same compression rate

02 Layer-wise compression rates align with known layer sensitivities

03 Method reduces human intervention in model compression processes

Abstract

Network pruning and knowledge distillation are two widely known model compression methods that efficiently reduce computation cost and model size. A common problem in both pruning and distillation is determining the compressed architecture, i.e., the exact number of filters per layer and the layer configuration, so as to preserve most of the original model's capacity. Despite the great advances in existing work, determining a good architecture still requires human intervention or extensive experimentation. In this paper, we propose an architecture slimming method that automates the layer configuration process. We start from the perspective that the capacity of the over-parameterized model can be largely preserved by finding the minimum number of filters that preserves the maximum parameter variance per layer, resulting in a thin architecture. We formulate the determination…
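The variance-preservation idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function name `filters_to_keep`, the default threshold of 0.95, and the choice to treat each filter as a PCA variable (with flattened parameter positions as samples) are all assumptions made for illustration. It returns the smallest number of principal components whose cumulative explained variance reaches the threshold, which could serve as a candidate filter count for one layer.

```python
import numpy as np


def filters_to_keep(weight, variance_threshold=0.95):
    """Hypothetical helper: smallest number of principal components whose
    cumulative explained variance reaches variance_threshold.

    weight: array of shape (n_filters, ...), e.g. (out_ch, in_ch, kh, kw).
    """
    n_filters = weight.shape[0]
    # Assumption: rows are parameter positions, columns are filters.
    x = weight.reshape(n_filters, -1).T
    # Center each filter (column) before PCA.
    x = x - x.mean(axis=0, keepdims=True)
    # Squared singular values are proportional to per-component variance.
    singular_values = np.linalg.svd(x, compute_uv=False)
    variance = singular_values ** 2
    total = variance.sum()
    if total == 0.0:
        # Degenerate layer with no variance: keep a single filter.
        return 1
    cumulative = np.cumsum(variance) / total
    # First index where the cumulative ratio reaches the threshold.
    k = int(np.searchsorted(cumulative, variance_threshold)) + 1
    # Guard against floating-point round-off at thresholds near 1.0.
    return min(k, len(cumulative))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    conv_weight = rng.standard_normal((16, 3, 3, 3))  # toy layer, random weights
    print(filters_to_keep(conv_weight, variance_threshold=0.95))
```

Applying such a function layer by layer with a shared threshold would give a different width per layer, which is one plausible reading of how layer-wise compression rates could follow layer sensitivity. The paper's exact formulation is truncated above and may differ.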

Taxonomy

Topics: Machine Learning and ELM · Neural Networks and Reservoir Computing · Domain Adaptation and Few-Shot Learning

Methods: Pruning · Knowledge Distillation