Gradient-based Regularization Parameter Selection for Problems with Non-smooth Penalty Functions
Jean Feng, Noah Simon

TL;DR
This paper introduces a gradient-based method for selecting regularization parameters in high-dimensional regression problems with non-smooth penalties, enabling more efficient tuning and improved generalization.
Contribution
It demonstrates that the validation loss is smooth almost everywhere with respect to penalty parameters, allowing gradient-based optimization in non-smooth penalized regression.
Findings
In simulation studies, gradient-based tuning can reduce generalization error
Increasing the number of penalty parameters, tuned with this method, can improve model performance
Validation loss is smooth almost everywhere for many penalties
Abstract
In high-dimensional and/or non-parametric regression problems, regularization (or penalization) is used to control model complexity and induce desired structure. Each penalty has a weight parameter that indicates how strongly the structure corresponding to that penalty should be enforced. Typically the parameters are chosen to minimize the error on a separate validation set using a simple grid search or a gradient-free optimization method. It is more efficient to tune parameters if the gradient can be determined, but this is often difficult for problems with non-smooth penalty functions. Here we show that for many penalized regression problems, the validation loss is actually smooth almost-everywhere with respect to the penalty parameters. We can therefore apply a modified gradient descent algorithm to tune parameters. Through simulation studies on example regression problems, we find…
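As an illustration of the idea in the abstract, the sketch below tunes a single lasso penalty weight by descending the validation loss. It is not the authors' algorithm: the lasso example, the coordinate-descent solver, the backtracking step rule, and all function names (`lasso_cd`, `val_loss_and_grad`, `tune_lambda`) are assumptions made for this example. It relies on the fact that, away from points where the active set changes, the lasso solution on its active set A with signs s is beta_A = (X_A^T X_A)^{-1}(X_A^T y - lambda s), so the validation loss is differentiable in lambda there.

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=300):
    """Coordinate descent for 0.5*||y - X b||^2 + lam*||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    resid = y.copy()
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]      # remove coordinate j's contribution
            rho = X[:, j] @ resid
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
            resid -= X[:, j] * beta[j]      # add the updated contribution back
    return beta

def val_loss_and_grad(X_tr, y_tr, X_val, y_val, lam):
    """Validation loss 0.5*||y_val - X_val b(lam)||^2 and its derivative in lam.

    The derivative is valid where the active set and signs are locally constant.
    """
    beta = lasso_cd(X_tr, y_tr, lam)
    r_val = y_val - X_val @ beta
    loss = 0.5 * r_val @ r_val
    active = np.flatnonzero(beta)
    if active.size == 0:
        return loss, 0.0                    # all-zero fit: loss is flat in lam
    XA = X_tr[:, active]
    s = np.sign(beta[active])
    dbeta = -np.linalg.solve(XA.T @ XA, s)  # d beta_A / d lam
    grad = -(X_val[:, active] @ dbeta) @ r_val
    return loss, grad

def tune_lambda(X_tr, y_tr, X_val, y_val, lam0=1.0, step=10.0,
                n_steps=20, tol=1e-6):
    """Projected gradient descent on lam with step halving (an assumed rule)."""
    lam = lam0
    for _ in range(n_steps):
        loss, grad = val_loss_and_grad(X_tr, y_tr, X_val, y_val, lam)
        if abs(grad) < tol:
            break
        t = step
        while t > 1e-8:
            cand = max(lam - t * grad, 1e-8)    # keep lam positive
            cand_loss, _ = val_loss_and_grad(X_tr, y_tr, X_val, y_val, cand)
            if cand_loss <= loss:               # accept only non-increasing loss
                lam = cand
                break
            t *= 0.5
        else:
            break                               # no acceptable step found
    return lam
```

Because the gradient is zero whenever every coefficient is shrunk to zero, a starting value `lam0` small enough to leave some coefficients active is a sensible choice in this sketch. The same implicit-differentiation pattern extends to several penalty parameters by replacing the scalar derivative with a gradient vector.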
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Advanced Multi-Objective Optimization Algorithms · Statistical Methods and Inference
