Pruning is Optimal for Learning Sparse Features in High-Dimensions

Nuri Mert Vural; Murat A. Erdogdu

arXiv:2406.08658·stat.ML·June 14, 2024

Open Access

TL;DR

This paper provides a theoretical explanation for why pruning neural networks enhances feature learning in high-dimensional settings, demonstrating that pruned networks can optimally learn sparse models with better sample complexity.

Contribution

The paper proves that pruning a neural network in proportion to the sparsity of the true model improves learning efficiency, and it establishes correlational statistical query (CSQ) lower bounds showing that pruned networks are optimal in high dimensions.

Findings

01. Pruned networks achieve optimal sample complexity for sparse models.

02. Unpruned, basis-independent methods are suboptimal in high-sparsity regimes.

03. CSQ lower bounds confirm the optimality of pruning in certain settings.

Abstract

While it is commonly observed in practice that pruning networks to a certain level of sparsity can improve the quality of the features, a theoretical explanation of this phenomenon remains elusive. In this work, we investigate this by demonstrating that a broad class of statistical models can be optimally learned using pruned neural networks trained with gradient descent, in high dimensions. We consider learning both single-index and multi-index models of the form $y = \sigma^{*}(V^{\top} x) + \epsilon$, where $\sigma^{*}$ is a degree-$p$ polynomial, and $V \in \mathbb{R}^{d \times r}$, with $r \ll d$, is the matrix containing relevant model directions. We assume that $V$ satisfies a certain $\ell_{q}$-sparsity condition for matrices and show that pruning neural networks proportional to the sparsity level of $V$ improves their…
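To make the model class in the abstract concrete, here is a minimal Python sketch of drawing samples from a sparse multi-index model and applying a simple pruning step. Everything specific in it is an assumption for illustration, not taken from the paper: the inputs are standard Gaussian, each column of $V$ has exactly `s` nonzero entries (hard sparsity, a simplification of the $\ell_{q}$-sparsity condition), the link is a hypothetical degree-2 polynomial, and `prune_top_k` is generic magnitude pruning rather than the authors' procedure. All function names are hypothetical.

```python
import math
import random


def make_sparse_directions(d, r, s, rng):
    """Return V as r columns, each a length-d unit vector with s nonzeros.

    Assumption: hard sparsity stands in for the paper's l_q-sparsity condition.
    """
    columns = []
    for _ in range(r):
        support = rng.sample(range(d), s)  # random support of size s
        col = [0.0] * d
        for i in support:
            col[i] = rng.gauss(0.0, 1.0)
        norm = math.sqrt(sum(v * v for v in col))
        columns.append([v / norm for v in col])  # normalise to unit length
    return columns


def link(z):
    """Hypothetical degree-2 polynomial link: sum over j of (z_j^2 - 1)."""
    return sum(zj * zj - 1.0 for zj in z)


def sample(n, V, noise_std, rng):
    """Draw n pairs (x, y) with y = link(V^T x) + noise, x standard Gaussian."""
    d = len(V[0])
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        z = [sum(vi * xi for vi, xi in zip(col, x)) for col in V]  # V^T x
        y = link(z) + rng.gauss(0.0, noise_std)
        data.append((x, y))
    return data


def prune_top_k(w, k):
    """Keep the k largest-magnitude entries of w and zero out the rest."""
    order = sorted(range(len(w)), key=lambda i: abs(w[i]), reverse=True)
    keep = set(order[:k])
    return [wi if i in keep else 0.0 for i, wi in enumerate(w)]


rng = random.Random(0)
V = make_sparse_directions(d=50, r=2, s=3, rng=rng)
data = sample(n=5, V=V, noise_std=0.1, rng=rng)
pruned = prune_top_k([0.1, -2.0, 0.5, 3.0], k=2)
```

In this sketch `pruned` keeps only the two largest-magnitude weights. Choosing `k` on the order of the sparsity `s` of the relevant directions mirrors the abstract's idea of pruning in proportion to the sparsity level of $V$.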

Taxonomy

Topics: Machine Learning and Data Classification · Face and Expression Recognition · Neural Networks and Applications

Methods: Pruning