Fitting very flexible models: Linear regression with large numbers of parameters
David W. Hogg (NYU), Soledad Villar (JHU)

TL;DR
This paper shows how highly flexible linear models, including models with far more parameters than data points, can be regularized and used to interpolate and denoise data, challenging the folk rule that many parameters necessarily lead to over-fitting.
Contribution
It demonstrates that models with more parameters than data points can generalize well when properly regularized, and connects regularized least squares to Gaussian process regression.
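The link between regularized least squares and Gaussian process regression can be sketched with a standard matrix identity. The notation here is an assumption for illustration, not necessarily the paper's: X is the n-by-p design matrix of basis functions evaluated at the data, Y the n-vector of observations, λ > 0 a ridge strength, and X_* the row of basis functions evaluated at a new point.

```latex
\hat{\beta}
  = \arg\min_{\beta}\; \|Y - X\beta\|^2 + \lambda\,\|\beta\|^2
  = (X^\top X + \lambda I_p)^{-1} X^\top Y
  = X^\top (X X^\top + \lambda I_n)^{-1} Y ,
\qquad
\hat{y}_* = X_*\hat{\beta}
  = k_*^\top (K + \lambda I_n)^{-1} Y ,
\quad K = X X^\top,\; k_* = X X_*^\top .
```

The last expression has the form of a Gaussian process posterior mean with kernel matrix K and noise variance λ. The n-by-n form stays well defined when p exceeds n, and as p grows the entries of K become sums over more and more basis functions, which is where a limiting kernel function can appear.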
Findings
Suitably regularized models with many parameters generalize well to held-out data.
Cross-validation is an effective guide for choosing model complexity.
The infinite-parameter limit corresponds to the mean of a Gaussian process.
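The first two findings can be illustrated with a minimal sketch, which is not the authors' code. It fits a Fourier basis with p = 201 parameters to n = 40 noisy points using ridge regression and picks the regularization strength by k-fold cross-validation. The toy data, the frequency weighting 1 / (1 + (s ω)^2) with s = 0.1, the λ grid, and all function names are assumptions made for this example.

```python
import numpy as np

def fourier_design(x, p, period=1.0, s=0.1):
    """n x p design matrix: a constant, then cos/sin pairs of rising frequency.

    Each pair is down-weighted by 1 / (1 + (s * omega)^2), a hypothetical
    choice that makes high-frequency features expensive for the ridge penalty.
    """
    x = np.asarray(x, dtype=float)
    cols = [np.ones_like(x)]
    k = 1
    while len(cols) < p:
        omega = 2.0 * np.pi * k / period
        w = 1.0 / (1.0 + (s * omega) ** 2)
        cols.append(w * np.cos(omega * x))
        if len(cols) < p:
            cols.append(w * np.sin(omega * x))
        k += 1
    return np.stack(cols, axis=1)

def ridge_fit(X, y, lam):
    """Ridge solution via the n x n system, which is cheap when p >> n."""
    n = X.shape[0]
    alpha = np.linalg.solve(X @ X.T + lam * np.eye(n), y)
    return X.T @ alpha

def kfold_mse(X, y, lam, n_folds=5, seed=0):
    """Mean squared prediction error on held-out folds."""
    idx = np.random.default_rng(seed).permutation(len(y))
    errs = []
    for test in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, test)
        beta = ridge_fit(X[train], y[train], lam)
        errs.append(np.mean((X[test] @ beta - y[test]) ** 2))
    return float(np.mean(errs))

# Toy data (assumed): a smooth periodic signal plus Gaussian noise.
rng = np.random.default_rng(42)
n, p = 40, 201  # far more parameters than data points
x = np.sort(rng.uniform(0.0, 1.0, n))
y = np.sin(2 * np.pi * x) + 0.3 * np.cos(6 * np.pi * x) + 0.1 * rng.normal(size=n)

X = fourier_design(x, p)
lams = [1e-6, 1e-4, 1e-2, 1.0]
cv_scores = {lam: kfold_mse(X, y, lam) for lam in lams}
best_lam = min(cv_scores, key=cv_scores.get)

print("held-out MSE by lambda:", cv_scores)
print("selected lambda:", best_lam, " variance of y:", float(np.var(y)))
```

Comparing the held-out error at the selected λ with the variance of y gives a rough sense of how much of the signal the over-parameterized fit recovers. Without the frequency weighting, every feature is penalized equally and the same fit would be expected to interpolate much less smoothly between data points.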
Abstract
There are many uses for linear fitting; the context here is interpolation and denoising of data, as when you have calibration data and you want to fit a smooth, flexible function to those data. Or you want to fit a flexible function to de-trend a time series or normalize a spectrum. In these contexts, investigators often choose a polynomial basis, or a Fourier basis, or wavelets, or something equally general. They also choose an order, or number of basis functions to fit, and (often) some kind of regularization. We discuss how this basis-function fitting is done, with ordinary least squares and extensions thereof. We emphasize that it is often valuable to choose far more parameters than data points, despite folk rules to the contrary: Suitably regularized models with enormous numbers of parameters generalize well and make good predictions for held-out data; over-fitting is not (mainly)…
