ResRep: Lossless CNN Pruning via Decoupling Remembering and Forgetting

Xiaohan Ding; Tianxiang Hao; Jianchao Tan; Ji Liu; Jungong Han; Yuchen Guo; Guiguang Ding

arXiv:2007.03260·cs.LG·August 17, 2021

TL;DR

ResRep is a lossless CNN channel-pruning method, inspired by neurobiology, that decouples remembering from forgetting to reach high compression without accuracy loss.

Contribution

It proposes a re-parameterization that splits a CNN into remembering parts and forgetting parts during training, then equivalently merges them back into the original architecture with narrower layers, enabling lossless channel pruning.

Findings

01 Slims ResNet-50 on ImageNet to only 45% of the original FLOPs (a reduction of about 55%) with no accuracy drop.

02 Reported as the first method to achieve lossless pruning at this compression ratio.

03 Trains the forgetting parts with a novel update rule that uses penalty gradients to produce structured sparsity.

Abstract

We propose ResRep, a novel method for lossless channel pruning (a.k.a. filter pruning), which slims down a CNN by reducing the width (number of output channels) of convolutional layers. Inspired by the neurobiology research about the independence of remembering and forgetting, we propose to re-parameterize a CNN into the remembering parts and forgetting parts, where the former learn to maintain the performance and the latter learn to prune. Via training with regular SGD on the former but a novel update rule with penalty gradients on the latter, we realize structured sparsity. Then we equivalently merge the remembering and forgetting parts into the original architecture with narrower layers. In this sense, ResRep can be viewed as a successful application of Structural Re-parameterization. Such a methodology distinguishes ResRep from the traditional learning-based pruning paradigm that…
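The "equivalently merge" step in the abstract can be illustrated with a minimal sketch. It assumes, for illustration only, that the remembering part is a single linear map `W` (a 1x1 convolution seen at one spatial position) and that the forgetting part is a square matrix `Q` applied right after it. The names `W`, `Q`, `merge_and_prune`, and `eps` are hypothetical and do not come from this page; batch normalization and the pruning of the following layer's input channels are left out.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_and_prune(W, Q, eps=1e-5):
    """Fold Q into W, dropping rows of Q whose Euclidean norm is near zero.

    W: D x C weights of the (assumed) remembering layer, D output channels.
    Q: D x D weights of the (assumed) forgetting layer applied after W.
    Returns the merged D' x C weights and the surviving channel indices.
    """
    keep = [i for i, row in enumerate(Q) if sum(v * v for v in row) ** 0.5 > eps]
    Q_kept = [Q[i] for i in keep]
    return matmul(Q_kept, W), keep


random.seed(0)
C, D = 3, 4  # input and output channel counts of the toy layer
W = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(D)]
Q = [[random.uniform(-1, 1) for _ in range(D)] for _ in range(D)]

# Pretend training has driven output channels 1 and 3 of Q to zero.
Q[1] = [0.0] * D
Q[3] = [0.0] * D

x = [[random.uniform(-1, 1)] for _ in range(C)]  # one input position, column vector

full = matmul(Q, matmul(W, x))   # two-part model: D outputs, two of them zero
W_merged, keep = merge_and_prune(W, Q)
slim = matmul(W_merged, x)       # single narrower layer: len(keep) outputs

for j, i in enumerate(keep):
    print(f"channel {i}: two-part={full[i][0]:.6f}  merged={slim[j][0]:.6f}")
```

Because the two parts compose linearly, the merged layer reproduces the surviving outputs up to floating-point rounding, and the zeroed channels can be dropped without changing them. In a full network the next layer would also need its matching input channels removed, which this sketch does not show.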

Taxonomy

Topics: Advanced Neural Network Applications · Image and Signal Denoising Methods · Domain Adaptation and Few-Shot Learning

Methods: Pruning · Stochastic Gradient Descent