Hessian-Aware Pruning and Optimal Neural Implant
Shixing Yu, Zhewei Yao, Amir Gholami, Zhen Dong, Sehoon Kim, Michael W. Mahoney, Kurt Keutzer

TL;DR
This paper introduces Hessian-Aware Pruning (HAP) combined with Neural Implant, a novel structured pruning method that uses second-order sensitivity to prune neural networks effectively while maintaining accuracy.
Contribution
The paper proposes a new Hessian-aware pruning method with Neural Implant that improves pruning efficiency and accuracy preservation by using second-order sensitivity metrics.
Findings
Achieves less than 0.1% and 0.5% accuracy degradation with over 70% and 50% of parameters pruned, respectively, on ResNet models.
Outperforms gradient-based head pruning methods on transformer models with up to 0.8% accuracy gain at 60% pruning.
Reports state-of-the-art results on computer vision and natural language tasks using the proposed method.
Abstract
Pruning is an effective method to reduce the memory footprint and FLOPs associated with neural network models. However, existing structured-pruning methods often result in significant accuracy degradation for moderate pruning levels. To address this problem, we introduce a new Hessian Aware Pruning (HAP) method coupled with a Neural Implant approach that uses second-order sensitivity as a metric for structured pruning. The basic idea is to prune insensitive components and to use a Neural Implant for moderately sensitive components, instead of completely pruning them. For the latter approach, the moderately sensitive components are replaced with a low-rank implant that is smaller and less computationally expensive than the original component. We use the relative Hessian trace to measure sensitivity, as opposed to the magnitude-based sensitivity metric commonly used in the…
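The prune / implant / keep split described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes a toy setting in which each prunable group (for example, a channel) has its own small, explicitly known Hessian block, estimates each block's trace with the standard Hutchinson estimator, and normalizes by group size as one plausible reading of "relative" Hessian trace. The names `hvp_quadratic`, `hutchinson_trace`, `assign_actions`, the block values, and the `prune_frac` / `implant_frac` budgets are all hypothetical.

```python
import random


def hvp_quadratic(A, v):
    """Hessian-vector product for the toy loss 0.5 * w^T A w, whose Hessian is A."""
    return [sum(a_ij * v_j for a_ij, v_j in zip(row, v)) for row in A]


def hutchinson_trace(hvp, dim, num_samples=200, seed=0):
    """Estimate trace(H) as the mean of v^T (H v) over random Rademacher vectors v."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        v = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        Hv = hvp(v)
        total += sum(v_i * hv_i for v_i, hv_i in zip(v, Hv))
    return total / num_samples


def assign_actions(sensitivity, prune_frac, implant_frac):
    """Rank groups by sensitivity: the least sensitive are pruned, the next
    least sensitive are marked for a low-rank implant, the rest are kept."""
    order = sorted(sensitivity, key=sensitivity.get)
    n_prune = int(len(order) * prune_frac)
    n_implant = int(len(order) * implant_frac)
    actions = {}
    for rank, name in enumerate(order):
        if rank < n_prune:
            actions[name] = "prune"
        elif rank < n_prune + n_implant:
            actions[name] = "implant"
        else:
            actions[name] = "keep"
    return actions


# Hypothetical per-channel Hessian blocks (symmetric 2x2 matrices).
blocks = {
    "ch0": [[0.01, 0.0], [0.0, 0.02]],
    "ch1": [[1.0, 0.2], [0.2, 1.5]],
    "ch2": [[5.0, 1.0], [1.0, 4.0]],
    "ch3": [[0.3, 0.0], [0.0, 0.1]],
}

sensitivity = {}
for name, A in blocks.items():
    trace = hutchinson_trace(lambda v, A=A: hvp_quadratic(A, v), dim=len(A))
    # Assumption: "relative" is taken here as trace divided by group size.
    sensitivity[name] = trace / len(A)

actions = assign_actions(sensitivity, prune_frac=0.25, implant_frac=0.25)
print(actions)
```

In a real network the Hessian is never formed explicitly; the Hessian-vector product would come from automatic differentiation, and only the `hvp` callable would change. The sketch stops at labeling groups and does not build the implant itself. As a general point about low-rank replacements, a rank-r factorization of a d_out × d_in weight matrix stores r(d_in + d_out) values instead of d_in · d_out, which is where the savings for the "implant" groups would come from.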
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Adversarial Robustness in Machine Learning
Methods: Pruning · Attentive Walk-Aggregating Graph Neural Network
