Penalising the biases in norm regularisation enforces sparsity

Etienne Boursier; Nicolas Flammarion

arXiv:2303.01353·stat.ML·April 9, 2025·1 cites

TL;DR

This paper investigates how penalising bias terms in norm regularisation influences the sparsity and uniqueness of neural network estimators. It shows that bias regularisation adds a position-dependent weight to the total variation of the second derivative, and that this weight enforces sparsity in the number of kinks.

Contribution

It provides a theoretical analysis linking bias regularisation to sparsity and uniqueness of minimal norm solutions in one-hidden-layer ReLU networks.

Findings

01 Bias regularisation introduces a weighting factor that enforces sparsity.

02 Omitting bias regularisation allows for non-sparse solutions.

03 Regularising biases leads to minimal norm interpolators with fewer kinks.

Abstract

Controlling the parameters' norm often yields good generalisation when training neural networks. Beyond simple intuitions, the relation between regularising parameters' norm and obtained estimators remains theoretically misunderstood. For one hidden ReLU layer networks with unidimensional data, this work shows the parameters' norm required to represent a function is given by the total variation of its second derivative, weighted by a $\sqrt{1 + x^{2}}$ factor. Notably, this weighting factor disappears when the norm of bias terms is not regularised. The presence of this additional weighting factor is of utmost significance as it is shown to enforce the uniqueness and sparsity (in the number of kinks) of the minimal norm interpolator. Conversely, omitting the bias' norm allows for non-sparse solutions. Penalising the bias terms in the regularisation, either explicitly or implicitly, thus…
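The weighted total variation described in the abstract can be illustrated for piecewise-linear functions, where the second derivative is a sum of point masses at the kinks. The sketch below is not the authors' code. It assumes the penalty is half the squared Euclidean norm of the hidden-layer parameters and that the affine part of the function is free; the names `kink_cost`, `representation_cost` and `balanced_half_norm` are hypothetical. Under these assumptions, a single neuron $a\,\sigma(wx + b)$ has a kink at $x = -b/w$ with slope jump $aw$, and balancing the two layers gives a cost of $|a|\sqrt{w^2 + b^2} = |aw|\sqrt{1 + x^2}$, so the weight on each kink works out to $\sqrt{1 + x^2}$. Without the bias penalty the weight is 1.

```python
import math


def kink_cost(slope_change, x, penalise_bias=True):
    """Cost of one kink with slope jump `slope_change` at position `x`.

    Assumption: the weight is sqrt(1 + x^2) when biases are penalised
    and 1 when they are not.
    """
    weight = math.sqrt(1.0 + x * x) if penalise_bias else 1.0
    return abs(slope_change) * weight


def representation_cost(knots, slopes, penalise_bias=True):
    """Weighted total variation of f'' for a piecewise-linear f.

    knots:  kink positions x_1 < ... < x_k
    slopes: k + 1 slopes; slopes[i] is the slope just left of knots[i]
    """
    if len(slopes) != len(knots) + 1:
        raise ValueError("need exactly one more slope than knots")
    return sum(
        kink_cost(slopes[i + 1] - slopes[i], x, penalise_bias)
        for i, x in enumerate(knots)
    )


def balanced_half_norm(a, w, b):
    """Minimum over lam > 0 of 0.5 * ((a/lam)^2 + (lam*w)^2 + (lam*b)^2).

    Rescaling (a, w, b) -> (a/lam, lam*w, lam*b) leaves a * relu(w*x + b)
    unchanged, so this is the cheapest half squared norm for that neuron.
    Requires a != 0 and (w, b) != (0, 0).
    """
    r = math.hypot(w, b)
    lam = math.sqrt(abs(a) / r)  # balances the two layers
    return 0.5 * ((a / lam) ** 2 + (lam * r) ** 2)


if __name__ == "__main__":
    a, w, b = 2.0, 3.0, 4.0
    # Both quantities should agree: the balanced norm of the neuron
    # and the weighted cost of the kink it creates at x = -b / w.
    print(balanced_half_norm(a, w, b), kink_cost(a * w, -b / w))

    # The same total slope change, placed as one kink or split in two.
    one_kink = ([1.0], [0.0, 1.0])
    two_kinks = ([-1.0, 1.0], [0.0, 0.5, 1.0])
    for knots, slopes in (one_kink, two_kinks):
        print(
            representation_cost(knots, slopes, penalise_bias=True),
            representation_cost(knots, slopes, penalise_bias=False),
        )
```

Without the bias penalty, the cost depends only on the total slope change, so it cannot distinguish between placements or splits of kinks that produce the same total. With the bias penalty, the cost also depends on where each kink sits, which is the mechanism the abstract credits for uniqueness and sparsity.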

Code & Models

Repositories

eboursier/penalising_biases (official)

Videos

Penalising the biases in norm regularisation enforces sparsity · SlidesLive

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and ELM · Stochastic Gradient Optimization Techniques