Generalization Guarantees on Data-Driven Tuning of Gradient Descent with Langevin Updates
Saumya Goyal, Rohith Rongali, Ritabrata Ray, Barnabás Póczos

TL;DR
This paper introduces LGD, a Langevin-based gradient descent method for hyperparameter tuning in convex regression, providing theoretical guarantees and empirical validation for few-shot learning.
Contribution
It proposes LGD, an algorithm with a proven optimality result for squared loss and a pseudo-dimension generalization bound for meta-learned hyperparameters, extending prior results for the elastic net to convex regression tasks.
Findings
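There exists a hyperparameter configuration for which LGD achieves the Bayes' optimal solution for squared loss.
Meta-learning hyperparameters for LGD has a pseudo-dimension bound of O(dh), up to logarithmic terms, where d is the number of parameters and h is the hyperparameter dimension.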
Empirical results show LGD's effectiveness in few-shot linear regression.
Abstract
We study learning to learn for regression problems through the lens of hyperparameter tuning. We propose the Langevin Gradient Descent Algorithm (LGD), which approximates the mean of the posterior distribution defined by the loss function and regularizer of a convex regression task. We prove the existence of an optimal hyperparameter configuration for which the LGD algorithm achieves the Bayes' optimal solution for squared loss. Subsequently, we study generalization guarantees on meta-learning optimal hyperparameters for the LGD algorithm from a given set of tasks in the data-driven setting. For a number of parameters d and hyperparameter dimension h, we show a pseudo-dimension bound of O(dh), up to logarithmic terms, under mild assumptions on LGD. This matches the dimensional dependence of the bounds obtained in prior work for the elastic net, which only allows for …
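To make the phrase "approximates the mean of the posterior distribution defined by the loss function and regularizer" concrete, here is a minimal sketch of generic unadjusted Langevin dynamics on a ridge-regularized squared loss. It is not the paper's LGD algorithm, whose exact update rule and hyperparameterization are not given in this summary. The function name `langevin_posterior_mean` and the parameters `lam`, `beta`, `step`, `n_steps`, and `burn_in` are all illustrative assumptions.

```python
import numpy as np

def langevin_posterior_mean(X, y, lam=1.0, beta=1.0, step=1e-3,
                            n_steps=20000, burn_in=5000, seed=0):
    """Estimate the mean of p(w) proportional to exp(-beta * U(w)), where
    U(w) = 0.5 * ||X w - y||^2 + 0.5 * lam * ||w||^2,
    by averaging the iterates of unadjusted Langevin dynamics.

    Hypothetical helper for illustration only; not the paper's LGD.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    w = np.zeros(d)
    running_sum = np.zeros(d)
    kept = 0
    for t in range(n_steps):
        # Gradient of the regularized squared loss U at the current iterate.
        grad = X.T @ (X @ w - y) + lam * w
        # Langevin update: a gradient step plus scaled Gaussian noise.
        noise = rng.standard_normal(d)
        w = w - step * grad + np.sqrt(2.0 * step / beta) * noise
        # Average only the iterates after the burn-in period.
        if t >= burn_in:
            running_sum += w
            kept += 1
    return running_sum / kept

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((50, 3))
    y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(50)
    w_hat = langevin_posterior_mean(X, y)
    # For this quadratic potential the posterior mean is the ridge solution.
    w_ridge = np.linalg.solve(X.T @ X + np.eye(3), X.T @ y)
    print(w_hat, w_ridge)
```

For this particular quadratic potential the posterior is Gaussian and its mean coincides with the closed-form ridge solution, which gives a simple sanity check on the averaged iterates. The step size must be small relative to the largest eigenvalue of `X.T @ X + lam * I` for the iteration to remain stable.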
