
TL;DR
S$^3$ is an algebraic framework that precisely defines, composes, and implements diverse structured sparsity patterns in neural networks, enabling improved pruning strategies.
Contribution
It introduces a formal, composable framework for specifying structured sparsity that integrates with existing pruning methods (OBD and OBS); implementations built on it outperform well-established second-order heuristics.
Findings
S$^3$ can specify a wide range of sparsity patterns including N:M and channel pruning.
Structured OBS and OBD implementations based on S$^3$ outperform second-order heuristics.
The framework is mathematically formalized and validated through experiments.
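For context on the second finding, the classical, unstructured OBD and OBS saliency scores can be sketched as below. This is a textbook illustration rather than the paper's structured variant: the function names are hypothetical, and the Hessian `H` is assumed to be given, symmetric, and invertible.

```python
import numpy as np

def obd_saliency(w, H):
    # OBD assumes a diagonal Hessian: s_i = 0.5 * H_ii * w_i^2.
    return 0.5 * np.diag(H) * w ** 2

def obs_saliency(w, H):
    # OBS uses the full inverse Hessian: s_i = w_i^2 / (2 * [H^-1]_ii).
    H_inv = np.linalg.inv(H)
    return w ** 2 / (2.0 * np.diag(H_inv))

def obs_prune_one(w, H):
    # Remove the lowest-saliency weight and compensate the remaining ones:
    # delta_w = -(w_i / [H^-1]_ii) * H^-1[:, i].
    H_inv = np.linalg.inv(H)
    i = int(np.argmin(w ** 2 / (2.0 * np.diag(H_inv))))
    w_new = w - (w[i] / H_inv[i, i]) * H_inv[:, i]
    w_new[i] = 0.0  # enforce an exact zero despite rounding
    return i, w_new
```

When `H` is diagonal the two scores coincide; they differ once off-diagonal curvature is present, which is where the OBS weight update matters.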
Abstract
We introduce the Structured Sparsity Specification (S$^3$), an algebraic framework for defining, composing, and implementing structured sparse patterns. S$^3$ specifies sparsity through three components: a View that reshapes the tensor via layout composition, a Block specification that defines the atomic pruning unit, and the sparsity decision Scope. Both Block and Scope support Coupling across tensors for coordinated sparsification. S$^3$ enables precise specification of diverse sparsity structures, from fine-grained N:M patterns to coarse channel pruning, and integrates seamlessly with Optimal Brain Damage (OBD) and Optimal Brain Surgeon (OBS). We formalize the framework mathematically, demonstrate its expressiveness on canonical patterns, and validate it experimentally via structured OBS and OBD implementations built entirely on S$^3$, which surpass well-established second-order heuristics on…
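The View, Block, and Scope roles described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's API: the helper `structured_mask`, its parameters, and the squared-magnitude scoring are all assumptions made for illustration, and Coupling across tensors is not modeled.

```python
import numpy as np

def structured_mask(weights, view_shape, block_axes, scope_axis, keep):
    """Hypothetical helper: boolean keep-mask with the same shape as `weights`."""
    # View: reshape the tensor so blocks and scopes line up with axes.
    view = weights.reshape(view_shape)
    # Block: axes in block_axes are collapsed into one atomic unit, scored
    # here by its squared L2 norm (an assumed magnitude criterion).
    scores = view ** 2
    if block_axes:
        scores = scores.sum(axis=block_axes, keepdims=True)
    # Scope: blocks compete along scope_axis; the `keep` highest-scoring survive.
    order = np.argsort(-scores, axis=scope_axis, kind="stable")
    ranks = np.argsort(order, axis=scope_axis, kind="stable")
    mask = ranks < keep
    # Expand block-level decisions to every element, then undo the View.
    return np.broadcast_to(mask, view.shape).reshape(weights.shape)

W = np.arange(16, dtype=float).reshape(2, 8)

# 2:4 sparsity: view each row as groups of 4, block = single element,
# scope = one group, keep 2 elements per group.
nm_mask = structured_mask(W, (2, 2, 4), (), 2, keep=2)

# Channel pruning: block = a whole row (output channel),
# scope = all rows of the tensor, keep 2 channels.
ch_mask = structured_mask(W.reshape(4, 4), (4, 4), (1,), 0, keep=2)
```

The same function covers both ends of the granularity range only by changing the reshape and the axes, which is the kind of composability the abstract attributes to S$^3$.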
