Optimal Algorithms in Linear Regression under Covariate Shift: On the Importance of Precondition
Yuanshi Liu, Haihan Zhang, Qian Chen, Cong Fang

TL;DR
This paper investigates the fundamental limits of linear regression under covariate shift, identifying optimal algorithms and conditions for SGD to achieve these bounds, with a focus on preconditioning techniques.
Contribution
The paper establishes the min-max optimal estimator under covariate shift and characterizes when SGD-type algorithms attain this optimality, highlighting the importance of preconditioning.
Findings
The optimal estimator is a linear transformation of the best source estimator.
An efficient convex program computes this transformation given the source and target matrices.
The paper identifies conditions under which SGD and its variants are optimal.
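The role of a preconditioner in SGD under covariate shift can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the diagonal source and target covariances, the noise level, the step size, and the choice of the inverse source covariance as preconditioner are hypothetical and are not the paper's construction or its convex program. The sketch only shows how a fixed preconditioner enters the SGD update and how error is measured under the target covariance.

```python
import numpy as np


def target_excess_risk(theta_hat, theta_star, target_cov):
    """Excess risk under the target covariance: (theta_hat - theta*)^T T (theta_hat - theta*)."""
    diff = theta_hat - theta_star
    return float(diff @ target_cov @ diff)


def preconditioned_sgd(X, y, precond, lr):
    """One pass of SGD on the squared loss with a fixed preconditioner matrix.

    Update: theta <- theta - lr * precond @ grad, where grad is the
    single-sample gradient of 0.5 * (x^T theta - y)^2.
    """
    n, d = X.shape
    theta = np.zeros(d)
    for i in range(n):
        grad = (X[i] @ theta - y[i]) * X[i]
        theta = theta - lr * (precond @ grad)
    return theta


# Hypothetical problem instance (all values are illustrative assumptions).
rng = np.random.default_rng(0)
d, n = 5, 2000
source_eigs = np.array([1.0, 0.5, 0.25, 0.1, 0.05])  # assumed source spectrum
target_eigs = source_eigs[::-1].copy()               # assumed shifted target spectrum
target_cov = np.diag(target_eigs)
theta_star = np.ones(d)

# Source samples with covariance diag(source_eigs) and small label noise.
X = rng.normal(size=(n, d)) * np.sqrt(source_eigs)
y = X @ theta_star + 0.1 * rng.normal(size=n)

theta_plain = preconditioned_sgd(X, y, np.eye(d), lr=0.05)
theta_pre = preconditioned_sgd(X, y, np.diag(1.0 / source_eigs), lr=0.05)

risk_init = target_excess_risk(np.zeros(d), theta_star, target_cov)
risk_plain = target_excess_risk(theta_plain, theta_star, target_cov)
risk_pre = target_excess_risk(theta_pre, theta_star, target_cov)
print("initial:", risk_init, "plain SGD:", risk_plain, "preconditioned SGD:", risk_pre)
```

Passing the identity matrix recovers plain SGD, so the same function covers both cases. Which preconditioner performs better on the target depends on the spectra and the noise level; this sketch makes no claim about that.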
Abstract
A common pursuit in modern statistical learning is to attain satisfactory generalization outside the source data distribution (OOD). In theory, the challenge remains unsolved even under the canonical setting of covariate shift for the linear model. This paper studies the foundational (high-dimensional) linear regression where the ground truth variables are confined to an ellipse-shaped constraint and addresses two fundamental questions in this regime: (i) given the target covariate matrix, what is the min-max optimal algorithm under covariate shift? (ii) for what kinds of target classes do the commonly used SGD-type algorithms achieve optimality? Our analysis starts with establishing a tight lower generalization bound via a Bayesian Cramer-Rao inequality. For (i), we prove that the optimal estimator can be simply a certain linear transformation of the best estimator for the source…
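The setting described in the abstract can be written out as a short formulation. The symbols below (S, T, M, A, and the risk notation) are assumed here for illustration and may differ from the paper's own notation; in particular, the unit radius of the ellipse constraint is a normalization chosen for this sketch.

```latex
% Notation (S, T, M, A) is assumed for illustration, not taken from the paper.
\begin{gather*}
y = x^\top \theta^* + \varepsilon, \qquad \mathbb{E}[\varepsilon \mid x] = 0, \\
\mathbb{E}_{\mathrm{source}}[x x^\top] = S, \qquad \mathbb{E}_{\mathrm{target}}[x x^\top] = T, \\
\Theta = \{\theta : \theta^\top M \theta \le 1\}, \\
\mathcal{R}_T(\hat\theta, \theta^*) = \mathbb{E}\big[(\hat\theta - \theta^*)^\top T \, (\hat\theta - \theta^*)\big], \\
\inf_{\hat\theta} \; \sup_{\theta^* \in \Theta} \; \mathcal{R}_T(\hat\theta, \theta^*).
\end{gather*}
```

In this notation, question (i) asks which estimator attains the infimum, and the stated answer has the form of a linear map A applied to the best source estimator.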
Taxonomy
Topics: Statistical and numerical algorithms · Face and Expression Recognition · Neural Networks and Applications
Methods: Linear Regression · Stochastic Gradient Descent
