Group Sparse Regularization for Deep Neural Networks

Simone Scardapane; Danilo Comminiello; Amir Hussain; Aurelio Uncini

arXiv:1607.00485·stat.ML·February 14, 2017

TL;DR

This paper introduces a regularization method based on group Lasso to optimize neural network weights, neuron counts, and feature selection simultaneously, resulting in more compact and efficient models.

Contribution

It extends group Lasso to neural networks, enabling joint optimization of weights, neurons, and features in a unified framework using standard optimization routines.
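
As a rough sketch of what such a unified objective can look like (the notation is assumed here for illustration, not quoted from the paper): let $\mathcal{G}$ be the set of weight groups, one per input, hidden, or bias unit, let $L$ be the loss over $N$ training pairs $(x_i, y_i)$, let $f(\cdot\,; w)$ be the network, and let $\lambda > 0$ be a regularization strength. The $\sqrt{|g|}$ factor is the usual group Lasso scaling by group size and is likewise an assumption.

```latex
\min_{w} \; \frac{1}{N} \sum_{i=1}^{N} L\bigl(y_i, f(x_i; w)\bigr)
  \;+\; \lambda \sum_{g \in \mathcal{G}} \sqrt{|g|}\, \lVert g \rVert_2
```

Because the $\ell_2$ norm of a group is not squared, the penalty tends to drive whole groups to zero together, which is what allows a unit or an input feature to be removed as a whole.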

Findings

01. Achieves competitive performance with significantly smaller networks.

02. Effectively reduces input features while maintaining accuracy.

03. Outperforms classical weight decay and Lasso penalties in experiments.

Abstract

In this paper, we consider the joint task of simultaneously optimizing (i) the weights of a deep neural network, (ii) the number of neurons for each hidden layer, and (iii) the subset of active input features (i.e., feature selection). While these problems are generally dealt with separately, we present a simple regularized formulation that allows all three to be solved in parallel, using standard optimization routines. Specifically, we extend the group Lasso penalty (which originated in the linear regression literature) in order to impose group-level sparsity on the network's connections, where each group is defined as the set of outgoing weights from a unit. Depending on the specific case, the weights can be related to an input variable, to a hidden neuron, or to a bias unit, thus performing all the aforementioned tasks simultaneously in order to obtain a compact network. We perform an…
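
The group structure described in the abstract can be illustrated with a minimal NumPy sketch. It assumes each layer's weights are stored as a matrix of shape (units_in, units_out), so that row i holds the outgoing weights of unit i, and that each layer's bias vector forms one group. The function names, the square-root group-size scaling, and the tolerance `tol` are illustrative assumptions, not taken from the authors' repository.

```python
import numpy as np


def group_lasso_penalty(weights, biases=None):
    """Sum of scaled L2 norms over groups of outgoing weights.

    weights: list of 2-D arrays, each of shape (units_in, units_out).
             Row i is treated as the group of outgoing weights of unit i.
    biases:  optional list of 1-D arrays; each vector is treated as the
             single group of outgoing weights of that layer's bias unit.
    """
    total = 0.0
    for W in weights:
        row_norms = np.linalg.norm(W, axis=1)  # one L2 norm per unit
        # Assumed scaling: sqrt of the group size (number of outgoing weights).
        total += np.sqrt(W.shape[1]) * row_norms.sum()
    for b in (biases or []):
        total += np.sqrt(b.size) * np.linalg.norm(b)
    return float(total)


def active_units(W, tol=1e-3):
    """Boolean mask of units whose outgoing-weight group is not (near) zero."""
    return np.linalg.norm(W, axis=1) > tol


# Toy layer with two input units and two output units.
# The second unit's outgoing weights are all zero, so it could be removed.
W = np.array([[3.0, 4.0],
              [0.0, 0.0]])
penalty = group_lasso_penalty([W])
mask = active_units(W)
```

For the first layer, a False entry in the mask corresponds to an input feature that could be dropped; for later layers it corresponds to a hidden neuron that could be pruned. In training, the penalty would be added to the loss with a regularization weight.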

Code & Models

Repositories

https://bitbucket.org/ispamm/group-lasso-deep-networks
Official

Taxonomy

Methods: Weight Decay · Linear Regression