Robust Mean Estimation With Auxiliary Samples
Barron Han, Danil Akhtiamov, Reza Ghane, Babak Hassibi

TL;DR
This paper analyzes how auxiliary samples from a related distribution can improve mean estimation accuracy under Wasserstein-2 distance constraints, providing fundamental limits and optimal estimators for the mean square error in high-dimensional settings.
Contribution
The paper establishes theoretical limits and optimal estimators for mean estimation when the auxiliary distribution lies within a specified Wasserstein-2 distance of the target, and it identifies when auxiliary data is beneficial.
Findings
Auxiliary samples improve the MSE when the Wasserstein radius is small.
Explicit formulas for the worst-case MSE and the optimal estimators are derived.
Numerical simulations confirm theoretical results in Gaussian models.
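As a rough illustration of the first finding, the sketch below simulates a simplified special case and is not the paper's estimator or its worst-case analysis. It assumes identity-covariance Gaussians, an auxiliary distribution that differs from the target only by a mean shift of norm rho (for which the Wasserstein-2 distance equals rho), and a convex combination of the two sample means. All function names and parameter values are hypothetical.

```python
import numpy as np

def combined_mean(target, aux, w):
    """Convex combination of the two sample means; w weights the auxiliary mean."""
    return (1.0 - w) * target.mean(axis=0) + w * aux.mean(axis=0)

def predicted_mse(d, n, m, rho, w):
    # Assumed model: each sample mean has variance d/n or d/m (identity
    # covariance), and the auxiliary mean carries a squared bias of rho**2.
    return (1.0 - w) ** 2 * d / n + w ** 2 * (d / m + rho ** 2)

def best_weight(d, n, m, rho):
    # Minimizer of predicted_mse over w (it is a quadratic in w).
    return (d / n) / (d / n + d / m + rho ** 2)

def simulate_mse(d, n, m, rho, w, trials=2000, seed=0):
    """Monte Carlo estimate of E||estimate - mu||^2 under the mean-shift model."""
    rng = np.random.default_rng(seed)
    mu = np.zeros(d)
    shift = np.zeros(d)
    shift[0] = rho  # auxiliary mean is mu + shift, so the W2 distance is rho
    errs = np.empty(trials)
    for t in range(trials):
        x = rng.standard_normal((n, d)) + mu          # n target samples
        y = rng.standard_normal((m, d)) + mu + shift  # m auxiliary samples
        errs[t] = np.sum((combined_mean(x, y, w) - mu) ** 2)
    return errs.mean()

if __name__ == "__main__":
    d, n, m = 20, 10, 200
    for rho in (0.1, 0.5, 2.0, 10.0):
        w = best_weight(d, n, m, rho)
        print(rho, w, predicted_mse(d, n, m, rho, w), simulate_mse(d, n, m, rho, w))
```

In this toy model the target-only estimator has MSE d/n, and the weight on the auxiliary mean shrinks toward zero as rho grows, which matches the qualitative claim that auxiliary samples help most when the radius is small.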
Abstract
In data-driven learning and inference tasks, the high cost of acquiring samples from the target distribution often limits performance. A common strategy to mitigate this challenge is to augment the limited target samples with data from a more accessible "auxiliary" distribution. This paper establishes fundamental limits of this approach by analyzing the improvement in the mean square error (MSE) when estimating the mean of the target distribution. Using the Wasserstein-2 metric to quantify the distance between distributions, we derive expressions for the worst-case MSE when samples are drawn (with labels) from both a target distribution and an auxiliary distribution within a specified Wasserstein-2 distance from the target distribution. We explicitly characterize the achievable MSE and the optimal estimator in terms of the problem dimension, the number of samples from the target and…
Taxonomy
Topics: Advanced Statistical Methods and Models · Statistical Methods and Inference · Advanced Statistical Process Monitoring
