Maximum Likelihood Estimation is All You Need for Well-Specified Covariate Shift
Jiawei Ge, Shange Tang, Jianqing Fan, Cong Ma, Chi Jin

TL;DR
This paper proves that classical Maximum Likelihood Estimation (MLE), fit on source data alone and without any modification, is minimax optimal (up to a constant factor) for out-of-distribution generalization under covariate shift when the model is well specified.
Contribution
It establishes that MLE alone suffices for minimax-optimal generalization under covariate shift in well-specified models. The result covers a rich class of parametric models and does not require the density ratio between target and source to be bounded.
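The setting behind this claim can be written out in a few lines. The notation below ($P_S$, $P_T$, $f$, $\beta^\star$, $R_T$) is an illustrative assumption chosen for this summary and may differ from the paper's own symbols; it is a sketch of the standard well-specified covariate shift setup, not a restatement of the paper's theorem.

```latex
% Illustrative notation (assumed, not necessarily the paper's).
\begin{align*}
\text{Source data: } & x \sim P_S(X), \quad y \mid x \sim f(y \mid x; \beta^\star), \\
\text{Target data: } & x \sim P_T(X), \quad y \mid x \sim f(y \mid x; \beta^\star), \\
\text{MLE on source samples: } & \hat\beta_{\mathrm{MLE}} \in \arg\max_{\beta} \frac{1}{n} \sum_{i=1}^{n} \log f(y_i \mid x_i; \beta), \\
\text{Target excess risk: } & R_T(\beta) = \mathbb{E}_{x \sim P_T,\; y \mid x \sim f(\cdot \mid x; \beta^\star)}
  \bigl[ \log f(y \mid x; \beta^\star) - \log f(y \mid x; \beta) \bigr].
\end{align*}
```

Only the covariate distribution changes between source and target; the conditional model of $y$ given $x$ is shared and contains the truth, which is what "well specified" means here.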
Findings
MLE achieves minimax optimality under covariate shift in well-specified models.
The result applies to linear regression, logistic regression, and phase retrieval.
In misspecified models, MLE is no longer optimal, whereas the Maximum Weighted Likelihood Estimator (MWLE) can be minimax optimal in certain scenarios.
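The first finding can be illustrated with a small simulation. This is a minimal sketch, not the paper's experiment: it assumes a well-specified Gaussian-noise linear regression, where the MLE coincides with ordinary least squares, and Gaussian source and target covariates with different means. The function names (`fit_mle_linear`, `target_excess_risk`, `simulate`) and all parameter values are hypothetical choices made for this example.

```python
import numpy as np


def fit_mle_linear(X, y):
    """MLE for linear regression with Gaussian noise, i.e. ordinary least squares."""
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta


def target_excess_risk(theta_hat, theta_star, mean_t, cov_t):
    """Excess squared-error risk E_T[(x^T (theta_hat - theta_star))^2]
    for target covariates with mean mean_t and covariance cov_t."""
    diff = theta_hat - theta_star
    second_moment = cov_t + np.outer(mean_t, mean_t)  # E_T[x x^T]
    return float(diff @ second_moment @ diff)


def simulate(n=2000, dim=3, seed=0):
    """Fit the MLE on source data only, then score it on a shifted target."""
    rng = np.random.default_rng(seed)
    theta_star = rng.normal(size=dim)

    # Source covariates: standard normal (assumed for illustration).
    X_src = rng.normal(loc=0.0, scale=1.0, size=(n, dim))
    # Well-specified model: the same linear conditional holds on source and target.
    y_src = X_src @ theta_star + rng.normal(scale=0.5, size=n)

    theta_hat = fit_mle_linear(X_src, y_src)

    # Target covariates: mean shifted to 2.0 in every coordinate (assumed).
    mean_t = np.full(dim, 2.0)
    cov_t = np.eye(dim)
    risk = target_excess_risk(theta_hat, theta_star, mean_t, cov_t)
    return theta_hat, theta_star, risk


if __name__ == "__main__":
    theta_hat, theta_star, risk = simulate()
    print("estimation error:", np.linalg.norm(theta_hat - theta_star))
    print("target excess risk:", risk)
```

No target samples and no importance weights are used in the fit. Because the linear model is correct everywhere, the source-only estimate transfers to the shifted target; in a misspecified model this would no longer be guaranteed.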
Abstract
A key challenge of modern machine learning systems is to achieve Out-of-Distribution (OOD) generalization, that is, generalizing to target data whose distribution differs from that of the source data. Despite its importance, the fundamental question of "what are the most effective algorithms for OOD generalization" remains open even under the standard setting of covariate shift. This paper addresses this question by proving that, surprisingly, classical Maximum Likelihood Estimation (MLE) using only source data (without any modification) achieves minimax optimality for covariate shift under the well-specified setting. That is, no algorithm performs better than MLE in this setting (up to a constant factor), which justifies the claim that MLE is all you need. Our result holds for a very rich class of parametric models, and does not require any boundedness condition on the density ratio.…
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Statistical Methods and Inference · Machine Learning and Algorithms
