Optimizing generalization on the train set: a novel gradient-based framework to train parameters and hyperparameters simultaneously
Karim Lounici, Katia Meziani, Benjamin Riu

TL;DR
This paper introduces a gradient-based framework that trains a model and optimizes its hyperparameters at the same time. It improves generalization without prior tuning, is illustrated on regression, accommodates additional objectives such as correlation and sparsity, and reduces computational complexity.
Contribution
A novel risk measure enabling automatic hyperparameter and feature selection through gradient-based optimization, eliminating the need for manual tuning.
Findings
Significantly reduced runtime compared to benchmarks.
Effective simultaneous training and regularization.
Applicable to multiple objectives like correlation and sparsity.
Abstract
Generalization is a central problem in Machine Learning. Most prediction methods require careful calibration of hyperparameters carried out on a hold-out validation dataset to achieve generalization. The main goal of this paper is to present a novel approach based on a new measure of risk that allows us to develop novel fully automatic procedures for generalization. We illustrate the pertinence of this new framework in the regression problem. The main advantages of this new approach are: (i) it can simultaneously train the model and perform regularization in a single run of a gradient-based optimizer on all available data without any previous hyperparameter tuning; (ii) this framework can tackle several additional objectives simultaneously (correlation, sparsity, ...) via the introduction of regularization parameters. Noticeably, our approach transforms hyperparameter tuning…
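To make the idea of treating a hyperparameter as a continuous variable concrete, here is a minimal sketch. It is not the paper's risk measure, which the abstract does not specify. As a stand-in, it uses the closed-form leave-one-out error of ridge regression, a criterion that is computed on all training data with no hold-out set and is differentiable in the regularization strength. The function names `loo_error` and `tune_ridge`, the step-size rule, and the finite-difference gradient are all assumptions made for illustration.

```python
import numpy as np


def loo_error(X, y, log_lam):
    """Closed-form leave-one-out mean squared error of ridge regression.

    Stand-in criterion (an assumption, not the paper's risk measure).
    """
    n, p = X.shape
    lam = np.exp(log_lam)
    # Hat matrix H = X (X^T X + lam I)^{-1} X^T
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    residuals = y - H @ y
    # Leave-one-out residuals follow from the diagonal of H
    loo_residuals = residuals / (1.0 - np.diag(H))
    return float(np.mean(loo_residuals ** 2))


def tune_ridge(X, y, log_lam=0.0, lr=1.0, steps=100, eps=1e-4):
    """Gradient descent on log(lambda) with a central finite-difference gradient.

    A step is accepted only if it lowers the criterion; otherwise the
    step size is halved, so the criterion never increases.
    """
    err = loo_error(X, y, log_lam)
    for _ in range(steps):
        grad = (loo_error(X, y, log_lam + eps)
                - loo_error(X, y, log_lam - eps)) / (2.0 * eps)
        # Clip to keep exp(log_lam) in a numerically safe range
        candidate = float(np.clip(log_lam - lr * grad, -20.0, 20.0))
        cand_err = loo_error(X, y, candidate)
        if cand_err < err:
            log_lam, err = candidate, cand_err
        else:
            lr *= 0.5
    lam = float(np.exp(log_lam))
    # Ridge coefficients at the selected regularization strength
    beta = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
    return lam, beta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(60, 8))
    y = X @ rng.normal(size=8) + rng.normal(size=60)
    lam, beta = tune_ridge(X, y)
    print("selected lambda:", lam)
```

Optimizing over log(lambda) rather than lambda keeps the regularization strength positive without an explicit constraint. In this sketch the coefficients come from the ridge closed form at each candidate value, so a single optimization loop over the hyperparameter also yields the fitted model.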
Taxonomy
Topics: Machine Learning and Data Classification · Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms
Methods: Feature Selection
