Exact minimax risk for linear least squares, and the lower tail of sample covariance matrices
Jaouad Mourtada

TL;DR
This paper characterizes the exact minimax risk of linear least squares prediction as a function of the covariate distribution, expresses it through statistical leverage scores, and gives sharp bounds on the lower tail of sample covariance matrices.
Contribution
It derives an exact minimax risk formula for random-design linear regression, relates that risk to the distribution of leverage scores, and establishes bounds on the lower tail of sample covariance matrices using PAC-Bayes techniques.
Findings
Exact minimax risk expressed via leverage scores
Minimax lower bound of d/(n-d+1) for any covariate distribution
Sharp nonasymptotic upper bounds under a "small ball"-type regularity condition, in both well-specified and misspecified cases
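For readers unfamiliar with the quantity named in the first finding, the statistical leverage score of sample i in an n-by-d design matrix X is the i-th diagonal entry of the hat matrix H = X (X^T X)^{-1} X^T. The following is a minimal NumPy sketch, not code from the paper: the function name `leverage_scores` and the Gaussian test design are illustrative assumptions, and X is assumed to have full column rank.

```python
import numpy as np

def leverage_scores(X):
    """Return the diagonal of the hat matrix H = X (X^T X)^{-1} X^T.

    Hypothetical helper for illustration. Assumes X has full column rank.
    Uses a reduced QR factorization X = QR, for which H = Q Q^T.
    """
    Q, _ = np.linalg.qr(X)          # Q has shape (n, d) with orthonormal columns
    return np.sum(Q ** 2, axis=1)   # squared row norms of Q = diag(Q Q^T)

# Illustrative design: n samples in dimension d (values chosen arbitrarily).
rng = np.random.default_rng(0)
n, d = 50, 5
X = rng.standard_normal((n, d))
scores = leverage_scores(X)
```

Each score lies between 0 and 1, and the scores sum to d, so the average leverage is d/n. A sample with leverage close to 1 has a large influence on the least squares fit.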
Abstract
We consider random-design linear prediction and related questions on the lower tail of random matrices. It is known that, under boundedness constraints, the minimax risk is of order d/n in dimension d with n samples. Here, we study the minimax expected excess risk over the full linear class, depending on the distribution of covariates. First, the least squares estimator is exactly minimax optimal in the well-specified case, for every distribution of covariates. We express the minimax risk in terms of the distribution of statistical leverage scores of individual samples, and deduce a minimax lower bound of d/(n-d+1) for any covariate distribution, nearly matching the risk for Gaussian design. We then obtain sharp nonasymptotic upper bounds for covariates that satisfy a "small ball"-type regularity condition in both well-specified and misspecified cases. Our main technical…
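The comparison between the d/(n-d+1) lower bound and the Gaussian-design risk can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's experiment: it assumes a well-specified model with isotropic standard Gaussian covariates and noise variance `sigma2`, and it uses the standard fact that, conditionally on the design X, the expected excess risk of least squares is sigma2 * Tr((X^T X)^{-1}) when the population covariance is the identity. The function name, sample sizes, and trial count are arbitrary choices.

```python
import numpy as np

def ols_risk_gaussian_design(n, d, sigma2=1.0, trials=2000, seed=0):
    """Monte Carlo estimate of the expected excess risk of least squares.

    Hypothetical helper for illustration. Assumes covariates are N(0, I_d)
    and the model is well specified with noise variance sigma2, so that the
    risk conditional on X equals sigma2 * trace((X^T X)^{-1}).
    """
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(trials):
        X = rng.standard_normal((n, d))
        total += sigma2 * np.trace(np.linalg.inv(X.T @ X))
    return total / trials

n, d = 30, 5
estimate = ols_risk_gaussian_design(n, d)
lower_bound = d / (n - d + 1)      # bound stated in the abstract (unit noise variance)
gaussian_exact = d / (n - d - 1)   # inverse-Wishart mean, valid for n > d + 1
print(f"Monte Carlo: {estimate:.4f}")
print(f"d/(n-d+1):   {lower_bound:.4f}")
print(f"d/(n-d-1):   {gaussian_exact:.4f}")
```

With unit noise variance the two closed-form values differ only through the denominators n-d+1 and n-d-1, which is the sense in which the lower bound nearly matches the Gaussian case. Both exceed d/n, and the gap widens as d approaches n.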
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Statistical Methods and Inference · Random Matrices and Applications
