Reweighting samples under covariate shift using a Wasserstein distance criterion
Julien Reygner (CERMICS, GdR MASCOT-NUM), Adrien Touboul (CERMICS, IRT SystemX)

TL;DR
This paper proposes an optimal reweighting method, based on the Wasserstein distance, that aligns the empirical distribution of one sample with the law of another under covariate shift, with theoretical guarantees and applications in uncertainty quantification and regression.
Contribution
The paper introduces a Wasserstein-based reweighting approach that does not require absolute continuity of one law with respect to the other. It establishes consistency and asymptotic convergence rates, and the optimal weights admit a practical expression in terms of Nearest Neighbors.
Findings
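The optimal weights minimize the Wasserstein distance between the empirical measures of the two samples and can be expressed in terms of Nearest Neighbors.
Consistency and asymptotic convergence rates, in expected Wasserstein distance, hold without assuming absolute continuity of one random variable with respect to the other.
Applications include decoupled estimation in Uncertainty Quantification and a bound on the generalization error of Nearest Neighbor Regression under covariate shift.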
Abstract
Considering two random variables with different laws to which we only have access through finite size iid samples, we address how to reweight the first sample so that its empirical distribution converges towards the true law of the second sample as the size of both samples goes to infinity. We study an optimal reweighting that minimizes the Wasserstein distance between the empirical measures of the two samples, and leads to an expression of the weights in terms of Nearest Neighbors. The consistency and some asymptotic convergence rates in terms of expected Wasserstein distance are derived, and do not need the assumption of absolute continuity of one random variable with respect to the other. These results have some application in Uncertainty Quantification for decoupled estimation and in the bound of the generalization error for the Nearest Neighbor Regression under covariate shift.
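The Nearest Neighbors form of the weights mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard reading of the problem: the weights on the first sample are free on the probability simplex, so the cheapest transport plan sends each point of the second sample to its nearest point of the first, and each weight is the fraction of second-sample points that land there. The function name `nn_reweight`, the Euclidean metric, and the toy data are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def nn_reweight(x, y):
    """Nearest-neighbor weights for the sample x, given a target sample y.

    Hypothetical helper (not from the paper): weights[i] is the fraction of
    points of y whose nearest neighbor in x (Euclidean distance) is x[i].
    """
    x = np.asarray(x, dtype=float).reshape(len(x), -1)  # shape (n, d)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)  # shape (m, d)
    # Pairwise squared Euclidean distances between y and x, shape (m, n).
    d2 = ((y[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)
    # Index of the nearest x for each y; ties go to the lowest index.
    nearest = d2.argmin(axis=1)
    counts = np.bincount(nearest, minlength=len(x))
    weights = counts / len(y)
    return weights, nearest


# Toy data, chosen only for illustration.
x = np.array([0.0, 1.0, 10.0])
y = np.array([0.1, 0.9, 1.2, 0.8])
weights, nearest = nn_reweight(x, y)

# Reweighted estimate of the mean of the second law, using values at x only.
reweighted_mean = float(weights @ x)
```

With these weights, an expectation under the second law can be approximated by a weighted average of quantities evaluated on the first sample only, which is the kind of use the decoupled-estimation application suggests. The brute-force distance matrix costs O(nm) memory; a tree-based nearest-neighbor search would be the natural substitute for large samples.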
