SAFE: Finding Sparse and Flat Minima to Improve Pruning
Dongyeop Lee, Kwanhee Lee, Jinseok Chung, Namhoon Lee

TL;DR
This paper introduces SAFE, a novel pruning method that finds sparse and flat minima in neural networks, leading to better generalization, robustness, and performance compared to existing approaches.
Contribution
SAFE formulates pruning as a sparsity-constrained optimization encouraging flatness, solved via an augmented Lagrange dual approach, and extends it with a generalized projection for improved pruning.
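The recipe summarized above (a flatness-seeking objective, a sparsity constraint, and a dual update tied together by a projection) can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the function names `project_topk` and `safe_like_prune`, the hyperparameters `rho`, `lam`, `lr`, and the choice of a sharpness-aware (SAM-style) ascent step with an ADMM-style scaled dual variable are all hypothetical choices made for this example.

```python
import numpy as np

def project_topk(x, k):
    """Euclidean projection onto {z : ||z||_0 <= k}: keep the k largest-magnitude entries."""
    z = np.zeros_like(x)
    if k > 0:
        keep = np.argpartition(np.abs(x), -k)[-k:]
        z[keep] = x[keep]
    return z

def safe_like_prune(grad_fn, x0, k, rho=0.05, lam=1.0, lr=0.1,
                    inner_steps=20, outer_steps=50):
    """Illustrative sparse-and-flat pruning loop (assumed form, not the paper's code).

    grad_fn: callable returning the loss gradient at a parameter vector (assumption).
    k:       number of weights allowed to stay nonzero.
    """
    x = x0.astype(float).copy()      # dense weights being optimized
    z = project_topk(x, k)           # sparse copy that satisfies the constraint
    u = np.zeros_like(x)             # scaled dual variable for the constraint x = z
    for _ in range(outer_steps):
        for _ in range(inner_steps):
            g = grad_fn(x)
            # SAM-style ascent: perturb toward higher loss within an L2 ball of radius rho
            eps = rho * g / (np.linalg.norm(g) + 1e-12)
            g_sharp = grad_fn(x + eps)
            # Descent on the perturbed loss plus the quadratic penalty pulling x toward z - u
            x = x - lr * (g_sharp + lam * (x - z + u))
        z = project_topk(x + u, k)   # projection step enforces sparsity
        u = u + x - z                # dual ascent on the constraint residual
    return project_topk(x, k)

# Toy usage: quadratic loss 0.5 * ||x - t||^2, whose gradient is x - t
t = np.array([3.0, -2.0, 0.1, 0.05])
pruned = safe_like_prune(lambda x: x - t, np.zeros(4), k=2)
print(pruned)
```

In this toy setting the two large-magnitude coordinates survive and the rest are zeroed. The plain magnitude-based projection used here is the simplest choice; the generalized projection mentioned above would presumably replace `project_topk`, but its exact form is not given in this summary.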
Findings
SAFE produces sparse networks with better generalization.
SAFE is resilient to noisy data and real-world conditions.
SAFE compares favorably to established pruning baselines.
Abstract
Sparsifying neural networks often suffers from seemingly inevitable performance degradation, and it remains challenging to restore the original performance despite much recent progress. Motivated by recent studies in robust optimization, we aim to tackle this problem by finding subnetworks that are both sparse and flat at the same time. Specifically, we formulate pruning as a sparsity-constrained optimization problem where flatness is encouraged as an objective. We solve it explicitly via an augmented Lagrange dual approach and extend it further by proposing a generalized projection operation, resulting in novel pruning methods called SAFE and its extension, SAFE+. Extensive evaluations on standard image classification and language modeling tasks reveal that SAFE consistently yields sparse networks with improved generalization performance, which compares competitively to…
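One way to write down the problem the abstract describes is given below. The notation is an assumption made for illustration ($f$ for the training loss, $x$ for the weights, $\rho$ for the perturbation radius, $d$ for the number of weights kept, $z$ for an auxiliary sparse copy, $u$ for a scaled dual variable, $\lambda$ for the penalty weight); the paper's own symbols and exact update rules may differ.

```latex
% Sparsity-constrained, flatness-seeking objective (assumed form)
\min_{x} \; \max_{\|\epsilon\|_2 \le \rho} f(x + \epsilon)
\quad \text{subject to} \quad \|x\|_0 \le d

% Variable splitting x = z and the scaled augmented Lagrangian
\mathcal{L}_\lambda(x, z, u)
  = \max_{\|\epsilon\|_2 \le \rho} f(x + \epsilon)
  + I_{\{\|z\|_0 \le d\}}(z)
  + \frac{\lambda}{2}\,\|x - z + u\|_2^2
  - \frac{\lambda}{2}\,\|u\|_2^2

% Alternating updates
x^{t+1} \approx \arg\min_{x} \; \max_{\|\epsilon\|_2 \le \rho} f(x + \epsilon)
  + \frac{\lambda}{2}\,\|x - z^{t} + u^{t}\|_2^2
\\
z^{t+1} = \operatorname{proj}_{\{\|z\|_0 \le d\}}\!\left(x^{t+1} + u^{t}\right)
\\
u^{t+1} = u^{t} + x^{t+1} - z^{t+1}
```

Here $I$ is the indicator function of the sparsity constraint, so the $z$-update reduces to a projection; with the Euclidean metric that projection keeps the $d$ largest-magnitude entries.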
Taxonomy
Topics: Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques · Domain Adaptation and Few-Shot Learning
Methods: Pruning
