Optimal Algorithms in Linear Regression under Covariate Shift: On the Importance of Precondition

Yuanshi Liu; Haihan Zhang; Qian Chen; Cong Fang

arXiv:2502.09047 · stat.ML · February 14, 2025

TL;DR

This paper investigates the fundamental limits of linear regression under covariate shift, identifying optimal algorithms and conditions for SGD to achieve these bounds, with a focus on preconditioning techniques.

Contribution

It establishes the min-max optimal estimator under covariate shift and characterizes when SGD algorithms can attain this optimality, highlighting the importance of preconditioning.

Findings

01 The optimal estimator is a linear transformation of the source estimator.

02 An efficient convex program computes the transformation given the source and target matrices.

03 Conditions are identified under which SGD and its variants are optimal.
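The findings above concern how an estimator trained on source data performs under a target covariate distribution, and how preconditioning affects SGD in that setting. The following is a minimal sketch of those two ingredients only, not the paper's estimator or its convex program. Everything in it is an assumption made for illustration: the function names `target_excess_risk` and `preconditioned_sgd`, the diagonal covariances, the step size, and the choice of the inverse source covariance as preconditioner.

```python
import numpy as np


def target_excess_risk(theta_hat, theta_star, sigma_target):
    """Excess risk under the target covariance: diff^T Sigma_T diff."""
    diff = theta_hat - theta_star
    return float(diff @ sigma_target @ diff)


def preconditioned_sgd(X, y, precond, step_size):
    """One pass of averaged SGD on squared loss with a fixed preconditioner.

    `precond` is a (d, d) matrix applied to every stochastic gradient;
    the identity matrix recovers plain SGD. Returns the average of the iterates.
    """
    n, d = X.shape
    theta = np.zeros(d)
    theta_avg = np.zeros(d)
    for t in range(n):
        residual = X[t] @ theta - y[t]
        grad = residual * X[t]  # gradient of 0.5 * (x^T theta - y)^2
        theta = theta - step_size * (precond @ grad)
        theta_avg += (theta - theta_avg) / (t + 1)  # running mean of iterates
    return theta_avg


# Synthetic covariate shift (illustrative assumption): source and target
# covariances are diagonal with the same eigenvalues in opposite order.
rng = np.random.default_rng(0)
d, n = 5, 2000
source_eigs = np.logspace(0, -2, d)
sigma_source = np.diag(source_eigs)
sigma_target = np.diag(source_eigs[::-1])
theta_star = np.ones(d)

# Source samples x ~ N(0, Sigma_S), labels from a noisy linear model.
X = rng.standard_normal((n, d)) * np.sqrt(source_eigs)
y = X @ theta_star + 0.1 * rng.standard_normal(n)

theta_plain = preconditioned_sgd(X, y, np.eye(d), step_size=0.02)
theta_precond = preconditioned_sgd(
    X, y, np.linalg.inv(sigma_source), step_size=0.02
)

risk_plain = target_excess_risk(theta_plain, theta_star, sigma_target)
risk_precond = target_excess_risk(theta_precond, theta_star, sigma_target)
print(f"plain SGD target risk:          {risk_plain:.4f}")
print(f"preconditioned SGD target risk: {risk_precond:.4f}")
```

Both runs see the same source samples and are scored against the target covariance, so the only difference between them is the preconditioner. Which run does better depends on the eigenvalue profile, step size, and sample size chosen here, so the printed numbers should not be read as evidence for the paper's results.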

Abstract

A common pursuit in modern statistical learning is to attain satisfactory generalization outside the source data distribution (out-of-distribution, OOD). In theory, the challenge remains unsolved even under the canonical setting of covariate shift for the linear model. This paper studies the foundational (high-dimensional) linear regression problem where the ground-truth variables are confined to an ellipse-shaped constraint and addresses two fundamental questions in this regime: (i) given the target covariate matrix, what is the min-max optimal algorithm under covariate shift? (ii) for what kinds of target classes do the commonly used SGD-type algorithms achieve optimality? Our analysis starts with establishing a tight lower generalization bound via a Bayesian Cramér-Rao inequality. For (i), we prove that the optimal estimator can be simply a certain linear transformation of the best estimator for the source…
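One way to write down the min-max question described in the abstract is sketched below. The notation is assumed here for illustration and may differ from the paper's: \Sigma_T stands for the target covariate matrix, \Theta for the ellipse-shaped parameter set, \theta^\ast for the ground truth, and \hat\theta for an estimator built from source data.

```latex
% Notation assumed for illustration; the paper's own symbols may differ.
\[
  \mathcal{R}_T(\hat\theta;\theta^\ast)
  = \mathbb{E}\bigl[(\hat\theta-\theta^\ast)^\top \Sigma_T\,(\hat\theta-\theta^\ast)\bigr],
  \qquad
  \inf_{\hat\theta}\ \sup_{\theta^\ast\in\Theta}\ \mathcal{R}_T(\hat\theta;\theta^\ast).
\]
```

Under this reading, question (i) asks which estimator attains the infimum, and question (ii) asks for which target classes SGD-type algorithms match that value.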

Taxonomy

Topics: Statistical and numerical algorithms · Face and Expression Recognition · Neural Networks and Applications

Methods: Linear Regression · Stochastic Gradient Descent