Implicit Regularization for Group Sparsity
Jiangyuan Li, Thanh V. Nguyen, Chinmay Hegde, Raymond K. W. Wong

TL;DR
This paper introduces a neural reparameterization, the diagonally grouped linear neural network, and shows that gradient descent on the squared regression loss is biased towards group-sparse solutions without any explicit regularization. The analysis yields minimax-optimal error rates and improved sample complexity over existing bounds for diagonal linear networks.
Contribution
The paper presents a new reparameterization that reveals implicit regularization towards group sparsity, along with theoretical analysis and a new algorithm for sparse linear regression.
Findings
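Gradient descent on the squared regression loss, with no explicit regularizer, is biased towards group-sparse solutions.
The analysis of the gradient dynamics in the general noise setting yields minimax-optimal error rates.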
In the size-one group case, a new sparse linear regression algorithm is introduced.
Abstract
We study the implicit regularization of gradient descent towards structured sparsity via a novel neural reparameterization, which we call a diagonally grouped linear neural network. We show the following intriguing property of our reparameterization: gradient descent over the squared regression loss, without any explicit regularization, biases towards solutions with a group sparsity structure. In contrast to many existing works in understanding implicit regularization, we prove that our training trajectory cannot be simulated by mirror descent. We analyze the gradient dynamics of the corresponding regression problem in the general noise setting and obtain minimax-optimal error rates. Compared to existing bounds for implicit sparse regularization using diagonal linear networks, our analysis with the new reparameterization shows improved sample complexity. In the degenerate case of…
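The text above does not spell out the reparameterization, so the following is only a minimal sketch under stated assumptions. It assumes each group's weights are written as w_l = u_l^2 * v_l, with a scalar u_l and a vector v_l per group, and runs plain gradient descent on the squared loss from a small initialization. The function name `fit_grouped_reparam`, the assumed form of w_l, the initialization, and all hyperparameters are illustrative choices, not the paper's algorithm.

```python
import numpy as np

def fit_grouped_reparam(X, y, group_size, alpha=0.1, lr=0.01, steps=3000):
    """Plain gradient descent on the squared loss under the ASSUMED
    reparameterization w_l = u_l**2 * v_l (one scalar u_l per group)."""
    n, p = X.shape
    num_groups = p // group_size            # assumes p is divisible by group_size
    u = np.full(num_groups, alpha)          # small per-group scale (assumption)
    v = np.zeros((num_groups, group_size))  # per-group weight vector (assumption)
    for _ in range(steps):
        w = (u[:, None] ** 2 * v).reshape(p)
        # Gradient of (1 / 2n) * ||X w - y||^2 with respect to w, split by group.
        grad_w = (X.T @ (X @ w - y) / n).reshape(num_groups, group_size)
        # Chain rule through w_l = u_l**2 * v_l.
        grad_u = 2.0 * u * np.sum(v * grad_w, axis=1)
        grad_v = u[:, None] ** 2 * grad_w
        u -= lr * grad_u
        v -= lr * grad_v
    return (u[:, None] ** 2 * v).reshape(p)

def squared_loss(X, y, w):
    return 0.5 * np.mean((X @ w - y) ** 2)

# Synthetic, noise-free, underdetermined problem (n < p) with one active group.
rng = np.random.default_rng(0)
n, p, group_size = 60, 120, 4
X = rng.standard_normal((n, p))
w_true = np.zeros(p)
w_true[12:16] = 1.0                         # group index 3 is the only active group
y = X @ w_true

w_hat = fit_grouped_reparam(X, y, group_size)
group_norms = np.linalg.norm(w_hat.reshape(-1, group_size), axis=1)
initial_loss = squared_loss(X, y, np.zeros(p))
final_loss = squared_loss(X, y, w_hat)
print("largest group:", int(np.argmax(group_norms)))
print("loss before / after:", initial_loss, final_loss)
```

No penalty term appears in the loss. Any preference for a group-sparse fit in this sketch comes only from the parameterization and the small initialization `alpha`, which is the kind of effect the abstract describes.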
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Photoacoustic and Ultrasonic Imaging · Numerical methods in inverse problems
