Selection Plateau and a Sparsity-Dependent Hierarchy of Pruning Features
Guangqi Li, Yongxin Li

TL;DR
This paper uncovers a phenomenon called Selection Plateau in neural network pruning, proposing a sparsity-dependent feature complexity spectrum that explains pruning performance limits and guides future algorithm design.
Contribution
It introduces the Sparsity-Information-Complexity Spectrum (SICS) hypothesis, which links the minimum feature complexity needed to escape the plateau to the sparsity level, and demonstrates its explanatory power across different pruning methods and feature classes.
Findings
All rank-monotone weight scorers converge to the same accuracy at fixed sparsity.
Smooth non-monotone features provide +6.6% plateau escape at S=0.7; at S=0.8, only raw features with high-frequency wiggle escape (+2.6%).
Rank-alignment is necessary but not sufficient for effective pruning.
Abstract
We identify a Selection Plateau phenomenon in one-shot neural network pruning: all rank-monotone weight scorers converge to identical accuracy at fixed sparsity, independent of functional form. We propose the Sparsity-Information-Complexity Spectrum (SICS) hypothesis: a sparsity-dependent minimum feature complexity kappa(S) governs plateau escape, with kappa=0 sufficient at low sparsity (S<0.65), kappa=1 dominant at critical sparsity (S~0.7), and kappa=2 necessary at extreme sparsity (S>0.75). On ViT-Small/CIFAR-10, testing nine feature classes across four sparsities, smooth non-monotone features provide +6.6% escape at S=0.7, while only raw features with high-frequency wiggle escape at S=0.8 (+2.6%). A fake non-monotone scorer underperforms the gradient baseline, indicating the requirement is magnitude-independent non-monotonicity. A handcrafted Gaussian bump achieves only +0.006…
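One way to see why rank-monotone scorers would land on the same plateau is that one-shot top-k pruning depends only on the ranking of scores, so any strictly increasing transform of weight magnitude keeps exactly the same weights. The sketch below illustrates this on random weights; it is not the authors' code. The helper `prune_mask`, the choice of scorers, and the Gaussian bump centre and width are all hypothetical, and reading "rank-monotone" as "strictly increasing in |w|" is an assumption.

```python
import numpy as np

def prune_mask(scores, sparsity):
    """Boolean mask keeping the top (1 - sparsity) fraction of entries by score."""
    n_keep = int(round(scores.size * (1.0 - sparsity)))
    # Sort descending; a stable sort keeps tie-breaking consistent across scorers.
    order = np.argsort(-scores, kind="stable")
    mask = np.zeros(scores.size, dtype=bool)
    mask[order[:n_keep]] = True
    return mask

rng = np.random.default_rng(0)
weights = rng.normal(size=1000)   # stand-in for one layer's weights
mag = np.abs(weights)

# Hypothetical rank-monotone scorers: strictly increasing functions of |w|.
scorers = {
    "magnitude": mag,
    "squared": mag ** 2,
    "log1p": np.log1p(mag),
    "sqrt": np.sqrt(mag),
}
masks = {name: prune_mask(s, sparsity=0.7) for name, s in scorers.items()}
all_identical = all(
    np.array_equal(masks["magnitude"], m) for m in masks.values()
)

# Hypothetical non-monotone scorer: a Gaussian bump centred at |w| = 1.
bump = np.exp(-((mag - 1.0) ** 2) / 0.1)
bump_differs = not np.array_equal(masks["magnitude"], prune_mask(bump, 0.7))

print("monotone scorers share one mask:", all_identical)
print("non-monotone scorer changes the mask:", bump_differs)
```

Identical masks imply identical pruned networks and therefore identical accuracy, which is consistent with the plateau described above. The sketch only shows that a non-monotone scorer selects a different mask; whether that mask is better is the empirical question the abstract addresses.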
