spred: Solving $L_1$ Penalty with SGD

Liu Ziyin; Zihao Wang

arXiv:2210.01212 · cs.LG · July 13, 2023 · 1 citation

TL;DR

This paper introduces spred, a method that solves $L_1$-penalized problems with plain stochastic gradient descent through a differentiable reparametrization, enabling effective sparse neural network training and compression.

Contribution

The paper provides a theoretically grounded, exact differentiable solver for the $L_1$ penalty via reparametrization, applicable to the nonconvex objectives that arise in deep learning.
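
The equivalence behind this claim can be sketched in one line, assuming the reparametrization takes the elementwise-product form $w = u \odot v$ and the weight decay strength is $\kappa$ (both are notational assumptions; this summary does not state the paper's exact parametrization). For a scalar weight, $u^2 + v^2 \ge 2|uv|$ with equality exactly when $|u| = |v|$, so:

$$
\min_{u v = w} \kappa (u^2 + v^2) = 2\kappa |w|
\quad\Longrightarrow\quad
\min_{w} \; L(w) + 2\kappa \|w\|_1 \;=\; \min_{u, v} \; L(u \odot v) + \kappa \left( \|u\|_2^2 + \|v\|_2^2 \right).
$$

The right-hand side is differentiable in $(u, v)$ whenever $L$ is, which is what allows plain SGD with weight decay to be applied to it.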

Findings

01. Effective training of sparse neural networks for gene selection.

02. Successful neural network compression with the $L_1$ penalty.

03. Bridges the gap between sparsity in deep learning and statistical learning.

Abstract

We propose to minimize a generic differentiable objective with an $L_{1}$ constraint using a simple reparametrization and straightforward stochastic gradient descent. Our proposal is the direct generalization of previous ideas that the $L_{1}$ penalty may be equivalent to a differentiable reparametrization with weight decay. We prove that the proposed method, spred, is an exact differentiable solver of $L_{1}$ and that the reparametrization trick is completely "benign" for a generic nonconvex function. Practically, we demonstrate the usefulness of the method in (1) training sparse neural networks to perform gene selection tasks, which involve finding relevant features in a very high-dimensional space, and (2) neural network compression, a task on which previous attempts at applying the $L_{1}$ penalty have been unsuccessful. Conceptually, our result bridges the gap between the sparsity in…
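
To make the idea of "reparametrization plus weight decay" concrete, here is a minimal NumPy sketch on a lasso-style least-squares problem. It is an illustration under stated assumptions, not the authors' implementation: it assumes the weights are reparametrized as the elementwise product `w = u * v`, and the function name `spred_lasso` and all hyperparameters (`kappa`, `lr`, `steps`, `batch_size`) are hypothetical choices made for this example.

```python
import numpy as np


def spred_lasso(X, y, kappa, lr=0.05, steps=3000, batch_size=None, seed=0):
    """Gradient descent on (1/2m)||X(u*v) - y||^2 + kappa(||u||^2 + ||v||^2).

    Assumption: w = u * v (elementwise). Under that assumption the weight
    decay term acts like an L1 penalty of strength 2 * kappa on w.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Small random init; u = v = 0 is a stationary point, so avoid exact zeros.
    u = rng.normal(scale=0.1, size=d)
    v = rng.normal(scale=0.1, size=d)
    for _ in range(steps):
        if batch_size is None:
            Xb, yb = X, y  # full-batch gradient descent
        else:
            idx = rng.choice(n, size=batch_size, replace=False)
            Xb, yb = X[idx], y[idx]  # mini-batch (stochastic) gradient
        w = u * v
        grad_w = Xb.T @ (Xb @ w - yb) / len(yb)  # gradient of the data loss w.r.t. w
        grad_u = grad_w * v + 2.0 * kappa * u    # chain rule + weight decay
        grad_v = grad_w * u + 2.0 * kappa * v
        u -= lr * grad_u
        v -= lr * grad_v
    return u * v


if __name__ == "__main__":
    # Design with X^T X / n = I, so the L1 solution is soft-thresholding of z.
    n = 5
    X = np.sqrt(n) * np.eye(n)
    z = np.array([3.0, -2.0, 0.1, 0.0, 0.0])
    y = np.sqrt(n) * z
    print(spred_lasso(X, y, kappa=0.25))
```

In this toy design the $L_1$-penalized solution is known in closed form (soft-thresholding of `z` at `2 * kappa`), so coordinates with $|z_i|$ below the threshold should be driven to zero while the others shrink by the threshold. In a deep learning framework, the same idea would amount to replacing a weight tensor with a product of two tensors and enabling ordinary weight decay on both.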

Code & Models

Videos

spred: Solving $L_1$ Penalty with SGD · SlidesLive

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Sparse and Compressive Sensing Techniques · Statistical Methods and Inference

Methods: Feature Selection