Sparse Deep Learning Models with the $\ell_1$ Regularization

Lixin Shen; Rui Wang; Yuesheng Xu; Mingsong Yan

arXiv:2408.02801 · cs.LG · August 7, 2024

TL;DR

This paper investigates how $\ell_1$ regularization influences sparsity in deep neural networks, deriving models and algorithms to control sparsity levels while maintaining approximation accuracy.

Contribution

It introduces a statistical framework for $\ell_1$-regularized deep learning models and develops algorithms for selecting regularization parameters to achieve desired sparsity.

Findings

01 Algorithms effectively select regularization parameters for targeted sparsity.

02 Proposed models balance sparsity and approximation accuracy.

03 Numerical experiments validate the methods' effectiveness.

Abstract

Sparse neural networks are highly desirable in deep learning because they reduce model complexity. The goal of this paper is to study how the choice of regularization parameters influences the sparsity level of learned neural networks. We first derive $\ell_1$-norm sparsity-promoting deep learning models, including single-parameter and multi-parameter regularization models, from a statistical viewpoint. We then characterize the sparsity level of a regularized neural network in terms of the choice of the regularization parameters. Based on these characterizations, we develop iterative algorithms for selecting regularization parameters so that the weight parameters of the resulting deep neural network enjoy prescribed sparsity levels. Numerical experiments are presented to demonstrate the effectiveness of the proposed algorithms in choosing desirable regularization parameters and obtaining corresponding…
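
To make the link between the regularization parameter and the sparsity level concrete, the following is a minimal sketch on a much simpler problem than the paper treats: a linear least-squares model with a single $\ell_1$ parameter. It is not the authors' algorithm. The function names (`soft_threshold`, `ista`, `select_lambda`), the proximal-gradient solver, and the bisection search are all assumptions chosen for illustration; the abstract only states that the paper's algorithms are iterative and based on a characterization of the sparsity level.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def ista(X, y, lam, n_iter=500):
    """Minimize (1/(2n)) * ||X w - y||^2 + lam * ||w||_1 by proximal gradient."""
    n, d = X.shape
    # Step size 1/L, where L is the Lipschitz constant of the smooth part's gradient.
    step = n / (np.linalg.norm(X, 2) ** 2)
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        w = soft_threshold(w - step * grad, step * lam)
    return w


def select_lambda(X, y, target_nnz, n_bisect=40):
    """Heuristic bisection on lam to reach roughly target_nnz nonzero weights.

    Assumption (hypothetical helper): the nonzero count tends to shrink as
    lam grows. This is not guaranteed to be monotone, so the search may stop
    without hitting the target exactly.
    """
    n = X.shape[0]
    lo, hi = 0.0, np.max(np.abs(X.T @ y)) / n  # above hi, w = 0 is optimal
    for _ in range(n_bisect):
        lam = 0.5 * (lo + hi)
        w = ista(X, y, lam)
        nnz = np.count_nonzero(w)
        if nnz == target_nnz:
            break
        if nnz > target_nnz:
            lo = lam  # too dense: penalize more
        else:
            hi = lam  # too sparse: penalize less
    return lam, w


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((200, 20))
    w_true = np.zeros(20)
    w_true[:3] = [3.0, -2.0, 1.5]
    y = X @ w_true + 0.01 * rng.standard_normal(200)
    lam, w = select_lambda(X, y, target_nnz=3)
    print("lambda:", lam, "nonzeros:", np.count_nonzero(w))
```

In this toy setting a larger `lam` drives more weights exactly to zero through the soft-thresholding step, which is the same qualitative trade-off between sparsity and approximation accuracy that the abstract describes for deep networks.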

Taxonomy

Topics: Numerical methods in inverse problems · Model Reduction and Neural Networks · Stochastic Gradient Optimization Techniques