Distribution-dependent Generalization Bounds for Tuning Linear Regression Across Tasks
Maria-Florina Balcan, Saumya Goyal, Dravyansh Sharma

TL;DR
This paper derives distribution-dependent generalization bounds for tuning regularization hyperparameters in multi-task linear regression, showing improved bounds under certain distributional assumptions, especially in high-dimensional settings.
Contribution
The paper introduces distribution-dependent bounds for hyperparameter tuning in multi-task linear regression that are sharper than prior uniform (distribution-independent) bounds in high-dimensional regimes.
Findings
The bounds improve with the "niceness" of the data distribution and do not worsen as the feature dimension d increases.
Under sub-Gaussian assumptions, the bounds stay sharp even for large d.
An extension to ridge regression uses an estimate of the ground-truth mean to obtain tighter bounds.
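For orientation, the regularized objectives referred to above can be written out. This is a sketch in standard textbook notation, not taken from the paper: X and y are one task's design matrix and responses, and \lambda_1, \lambda_2 \ge 0 are the L1 and L2 coefficients being tuned. The second display is only one plausible reading of "ridge with an estimate of the ground-truth mean", with \hat{\mu} a hypothetical symbol for that estimate; the paper's exact formulation may differ.

```latex
% Elastic net: ridge is \lambda_1 = 0, lasso is \lambda_2 = 0.
\hat{\beta}_{\lambda_1,\lambda_2}
  = \arg\min_{\beta \in \mathbb{R}^d}
    \; \|y - X\beta\|_2^2 + \lambda_1 \|\beta\|_1 + \lambda_2 \|\beta\|_2^2

% Assumed form of a mean-shifted ridge penalty (illustrative only):
\hat{\beta}_{\lambda_2,\hat{\mu}}
  = \arg\min_{\beta \in \mathbb{R}^d}
    \; \|y - X\beta\|_2^2 + \lambda_2 \|\beta - \hat{\mu}\|_2^2
```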
Abstract
Modern regression problems often involve high-dimensional data, and careful tuning of the regularization hyperparameters is crucial to avoid overly complex models that may overfit the training data while guaranteeing desirable properties like effective variable selection. We study the recently introduced direction of tuning regularization hyperparameters in linear regression across multiple related tasks. We obtain distribution-dependent bounds on the generalization error for the validation loss when tuning the L1 and L2 coefficients, covering ridge, lasso, and the elastic net. In contrast, prior work develops bounds that apply uniformly to all distributions, but such bounds necessarily degrade with the feature dimension, d. While these bounds are shown to be tight for worst-case distributions, our bounds improve with the "niceness" of the data distribution. Concretely, we show that under…
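The setting described in the abstract, a single regularization coefficient shared across related tasks and chosen by validation loss, can be illustrated with a small sketch. It assumes ridge regression only, a fixed grid of candidate values, and synthetic Gaussian data; the helper names (`ridge_fit`, `avg_validation_loss`, `tune_lambda`, `make_tasks`) are hypothetical and not from the paper, and the paper's own tuning procedure may differ.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge solution: (X^T X + lam I)^{-1} X^T y, for lam > 0."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def avg_validation_loss(tasks, lam):
    """Mean validation MSE over all tasks for one shared lam."""
    losses = []
    for X_tr, y_tr, X_val, y_val in tasks:
        beta = ridge_fit(X_tr, y_tr, lam)
        losses.append(np.mean((X_val @ beta - y_val) ** 2))
    return float(np.mean(losses))

def tune_lambda(tasks, grid):
    """Return the grid value with the smallest average validation loss."""
    losses = [avg_validation_loss(tasks, lam) for lam in grid]
    return grid[int(np.argmin(losses))], losses

def make_tasks(n_tasks=20, n=30, d=50, noise=0.5, seed=0):
    """Synthetic tasks (an assumption for illustration): Gaussian features, d > n."""
    rng = np.random.default_rng(seed)
    tasks = []
    for _ in range(n_tasks):
        beta = rng.normal(size=d) / np.sqrt(d)
        X_tr, X_val = rng.normal(size=(n, d)), rng.normal(size=(n, d))
        y_tr = X_tr @ beta + noise * rng.normal(size=n)
        y_val = X_val @ beta + noise * rng.normal(size=n)
        tasks.append((X_tr, y_tr, X_val, y_val))
    return tasks

tasks = make_tasks()
grid = [0.01, 0.1, 1.0, 10.0, 100.0]
best_lam, losses = tune_lambda(tasks, grid)
print(best_lam, losses)
```

The generalization question studied in the paper is, roughly, how far this average validation loss over the sampled tasks can be from its expectation over new tasks; the sketch only shows the tuning step, not any bound.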
