AutoCompress: An Automatic DNN Structured Pruning Framework for Ultra-High Compression Rates
Ning Liu, Xiaolong Ma, Zhiyuan Xu, Yanzhi Wang, Jian Tang, and Jieping Ye

TL;DR
AutoCompress is an automated structured pruning framework for DNNs that achieves ultra-high compression rates and significant inference speedups by combining advanced pruning schemes, an innovative purification step, and a guided heuristic search.
Contribution
It introduces an automatic structured pruning framework that effectively combines pruning schemes, employs an ADMM-based core algorithm with a novel purification step, and replaces deep reinforcement learning with a guided heuristic search.
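To make the idea of a heuristic search over per-layer pruning rates more concrete, here is a minimal sketch. It assumes simulated annealing as the search heuristic, which this summary does not confirm. The scoring function `toy_score`, the `sensitivity` values, and every parameter of `anneal` are hypothetical stand-ins: in a real framework the score would come from pruning the network and measuring its accuracy. This is not the authors' implementation.

```python
import math
import random


def toy_score(rates, sensitivity):
    """Hypothetical stand-in for 'prune with these rates, then evaluate'.

    Higher is better: rewards a small fraction of kept weights and
    penalizes aggressive pruning of layers marked as sensitive.
    """
    kept = sum(1.0 - r for r in rates) / len(rates)
    penalty = sum(s * r * r for s, r in zip(sensitivity, rates))
    return -(kept + penalty)


def anneal(score, n_layers, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Simulated annealing over a vector of per-layer pruning rates."""
    rng = random.Random(seed)          # seeded for reproducible runs
    current = [0.5] * n_layers         # start from a uniform pruning rate
    current_score = score(current)
    best, best_score = list(current), current_score
    temp = t0
    for _ in range(steps):
        # Perturb the pruning rate of one randomly chosen layer.
        cand = list(current)
        i = rng.randrange(n_layers)
        cand[i] = min(0.99, max(0.0, cand[i] + rng.uniform(-0.1, 0.1)))
        cand_score = score(cand)
        delta = cand_score - current_score
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature cools.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = cand, cand_score
        if current_score > best_score:
            best, best_score = list(current), current_score
        temp *= cooling
    return best, best_score


# Assumed per-layer sensitivities: layer 0 tolerates pruning best.
sensitivity = [0.2, 1.0, 5.0]
best, best_score = anneal(lambda r: toy_score(r, sensitivity), n_layers=3)
print(best, best_score)
```

With these toy sensitivities the search should settle on a high pruning rate for the tolerant layer and a low one for the sensitive layer. A guided variant could bias the perturbation step using past results instead of sampling it uniformly.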
Findings
Outperforms prior automatic model compression work by up to 33x in pruning rate (120x reduction in actual parameter count) at the same accuracy.
Demonstrates significant inference speedup on smartphones.
Abstract
Structured weight pruning is a representative model compression technique for DNNs that reduces storage and computation requirements and accelerates inference. Because the number of flexible hyperparameters is large, an automatic hyperparameter determination process is necessary. This work proposes AutoCompress, an automatic structured pruning framework with the following key performance improvements: (i) effectively incorporating a combination of structured pruning schemes in the automatic process; (ii) adopting state-of-the-art ADMM-based structured weight pruning as the core algorithm and proposing an innovative additional purification step for further weight reduction without accuracy loss; and (iii) developing an effective heuristic search method enhanced by experience-based guided search, replacing the prior deep reinforcement learning technique, which has underlying incompatibility with the…
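For readers unfamiliar with structured pruning, the sketch below illustrates one scheme, filter pruning, on a single layer. It keeps the filters with the largest L2 norm and zeroes the rest, which is the Euclidean projection onto the constraint "at most `keep` nonzero filters". The function names and the flat-list weight layout are assumptions made for illustration. The ADMM iterations, retraining, and the purification step mentioned in the abstract are all omitted.

```python
import math


def filter_norms(weights):
    """L2 norm of each filter; `weights` is a list of flat filters."""
    return [math.sqrt(sum(w * w for w in f)) for f in weights]


def prune_filters(weights, keep):
    """Keep the `keep` filters with the largest L2 norm, zero the rest.

    Returns the pruned weights and the sorted indices of kept filters.
    """
    if not 0 <= keep <= len(weights):
        raise ValueError("keep must be between 0 and the number of filters")
    norms = filter_norms(weights)
    ranked = sorted(range(len(weights)), key=lambda i: norms[i], reverse=True)
    kept = sorted(ranked[:keep])
    kept_set = set(kept)
    pruned = [
        list(f) if i in kept_set else [0.0] * len(f)
        for i, f in enumerate(weights)
    ]
    return pruned, kept


def pruning_rate(weights, kept):
    """Ratio of original weight count to weights in the kept filters."""
    total = sum(len(f) for f in weights)
    remaining = sum(len(weights[i]) for i in kept)
    return total / remaining if remaining else float("inf")


# Toy layer with four filters of two weights each (made-up values).
layer = [[0.5, -0.5], [0.1, 0.0], [1.0, 2.0], [0.0, 0.2]]
pruned, kept = prune_filters(layer, keep=2)
print(kept, pruning_rate(layer, kept))  # prints "[0, 2] 2.0"
```

Because whole filters are removed, the surviving weights still form a dense, smaller tensor. This is the reason structured pruning tends to translate into real inference speedups more readily than unstructured sparsity.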
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
Methods: Pruning
