Dense for the Price of Sparse: Improved Performance of Sparsely Initialized Networks via a Subspace Offset
Ilan Price, Jared Tanner

TL;DR
This paper introduces a 'DCT plus Sparse' layer architecture that keeps networks trainable at extreme sparsity, with as few as 0.01% of kernel parameters trainable. Networks built from these layers and pruned at initialization reach state-of-the-art accuracy at extreme sparsities, without storing the full unpruned network or adding significant computational overhead.
Contribution
The authors propose a new layer architecture that maintains trainability at extreme sparsity levels and simplifies the pruning process at initialization.
Findings
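Maintains information propagation and trainability with as few as 0.01% of kernel parameters trainable, and achieves state-of-the-art accuracy at extreme sparsities.
Uses only simple heuristics to choose the locations of the trainable parameters.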
Does not increase storage or significantly impact computational cost.
Abstract
That neural networks may be pruned to high sparsities and retain high accuracy is well established. Recent research efforts focus on pruning immediately after initialization so as to allow the computational savings afforded by sparsity to extend to the training process. In this work, we introduce a new 'DCT plus Sparse' layer architecture, which maintains information propagation and trainability even with as little as 0.01% trainable kernel parameters remaining. We show that standard training of networks built with these layers, and pruned at initialization, achieves state-of-the-art accuracy for extreme sparsities on a variety of benchmark network architectures and datasets. Moreover, these results are achieved using only simple heuristics to determine the locations of the trainable parameters in the network, and thus without having to initially store or compute with the full, unpruned…
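The layer idea described in the abstract can be illustrated with a minimal NumPy sketch. It assumes a square dense layer whose weight is a fixed DCT matrix scaled by a constant plus a sparse trainable matrix, with the sparse support chosen uniformly at random. The names `dct_matrix`, `random_sparse_mask`, `DctPlusSparse`, and the scale `alpha` are hypothetical, as are the choice of the orthonormal DCT-II and the initialization scale; none of these details are confirmed by the text above. A practical version would presumably apply the DCT with a fast transform rather than store the matrix; it is materialized here only for clarity.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n (assumed choice of DCT)."""
    k = np.arange(n)[:, None]   # frequency index (rows)
    i = np.arange(n)[None, :]   # sample index (columns)
    d = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    d[0, :] /= np.sqrt(2.0)     # rescale first row so that D @ D.T == I
    return d

def random_sparse_mask(shape, density, rng):
    """Pick trainable locations uniformly at random (one simple heuristic)."""
    n_total = shape[0] * shape[1]
    n_keep = max(1, int(round(density * n_total)))
    flat = np.zeros(n_total, dtype=bool)
    flat[rng.choice(n_total, size=n_keep, replace=False)] = True
    return flat.reshape(shape)

class DctPlusSparse:
    """Hypothetical layer computing y = alpha * D x + S x.

    D is fixed and never trained; only the entries of S on the mask
    would be updated during training.
    """
    def __init__(self, n, density, alpha=1.0, seed=0):
        rng = np.random.default_rng(seed)
        self.alpha = alpha
        self.D = dct_matrix(n)
        self.mask = random_sparse_mask((n, n), density, rng)
        init = rng.normal(0.0, 1.0 / np.sqrt(n), size=(n, n))
        self.S = np.where(self.mask, init, 0.0)

    def forward(self, x):
        return self.alpha * (self.D @ x) + self.S @ x

layer = DctPlusSparse(n=16, density=0.05, seed=1)
y = layer.forward(np.ones(16))
print(y.shape, int(layer.mask.sum()), "trainable entries out of", 16 * 16)
```

Because the DCT matrix is orthonormal, the fixed term alone preserves the norm of its input, which gives one intuition for how information can still propagate when the sparse part has very few nonzero entries.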
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and ELM
Methods: Pruning
