Trimmed sample means for robust uniform mean estimation and regression
Roberto I. Oliveira, Lucas Resende

TL;DR
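This paper shows that estimators based on trimmed sample means are robust to heavy tails and adversarial data contamination in uniform mean estimation and regression, with optimal dependence on the contamination level. In synthetic experiments, trimmed mean linear regression often performs better than ordinary least squares and median-of-means alternatives.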
Contribution
It introduces trimmed-mean-based estimators for uniform mean estimation and for regression with quadratic loss. These are the first to achieve optimal dependence on the adversarial contamination level, and they match or improve upon the state of the art under heavy tails.
Findings
Trimmed mean estimators achieve optimal contamination dependence.
In synthetic experiments, trimmed mean linear regression often performs better than OLS and median-of-means-based methods.
The methods provide uniform error bounds for function expectations.
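To make the idea of estimating expectations over a whole family of functions concrete, here is a minimal sketch. It assumes the common symmetric definition of a trimmed mean (sort the values, drop the k smallest and k largest, average the rest) and applies it to each function in a small family. The helper name `trimmed_means_over_family`, the example family, and the choice of k are illustrative assumptions, not the paper's exact estimator or tuning.

```python
import numpy as np

def trimmed_means_over_family(samples, functions, k):
    """Symmetric trimmed mean of f(X_1), ..., f(X_n) for every f in `functions`.

    Hypothetical helper for illustration: for each f, the k smallest and
    k largest values of f(X_i) are discarded and the rest are averaged.
    """
    n = len(samples)
    if k < 0 or 2 * k >= n:
        raise ValueError("k must satisfy 0 <= 2k < n")
    # One row per function, one column per sample point.
    evaluated = np.array([[f(x) for x in samples] for f in functions], dtype=float)
    evaluated.sort(axis=1)  # sort each row independently
    return evaluated[:, k:n - k].mean(axis=1)

# Small illustrative family: identity, negation, and square.
family = [lambda x: x, lambda x: -x, lambda x: x ** 2]
data = [1.0, 2.0, 3.0, 4.0, 100.0]  # one gross outlier
estimates = trimmed_means_over_family(data, family, k=1)
```

With k = 1 the outlier 100 is discarded for every function in the family, so one trimming level controls the error of all the estimates at once. This is the sense in which the guarantee is uniform.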
Abstract
It is well known that trimmed sample means are robust against heavy tails and data contamination. This paper analyzes the performance of trimmed means and related methods in two novel contexts. The first one consists of estimating expectations of functions in a given family, with uniform error bounds; this is closely related to the problem of estimating the mean of a random vector under a general norm. The second problem considered is that of regression with quadratic loss. In both cases, trimmed-mean-based estimators are the first to obtain optimal dependence on the (adversarial) contamination level. Moreover, they also match or improve upon the state of the art in terms of heavy tails. Experiments with synthetic data show that a natural "trimmed mean linear regression" method often performs better than both ordinary least squares and alternative methods based on median-of-means.
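As a rough illustration of the robustness claim, the sketch below compares the plain sample mean with a symmetric trimmed mean on heavy-tailed data in which a small fraction of points has been replaced by a large value. The distribution (Student t with 3 degrees of freedom), the contamination scheme, and the trimming level are assumptions chosen for the demonstration and are not taken from the paper's experiments.

```python
import numpy as np

def trimmed_mean(values, k):
    """Average after discarding the k smallest and k largest values."""
    values = np.sort(np.asarray(values, dtype=float))
    n = values.size
    if k < 0 or 2 * k >= n:
        raise ValueError("k must satisfy 0 <= 2k < n")
    return values[k:n - k].mean()

rng = np.random.default_rng(0)

# Heavy-tailed clean sample with true mean 0 (assumed setup).
clean = rng.standard_t(df=3, size=1000)

# Simulated contamination: overwrite 2% of the points with a huge value.
contaminated = clean.copy()
contaminated[:20] = 1e6

plain = contaminated.mean()               # dragged far from 0 by the outliers
robust = trimmed_mean(contaminated, k=25)  # trims slightly more than 2% per side
```

Here the trimming level is set a little above the number of corrupted points so that all of them are discarded. In practice the contamination level is unknown, and choosing k is part of what the paper's theory addresses.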
Taxonomy
Topics: Advanced Statistical Methods and Models · Survey Sampling and Estimation Techniques · Statistical Methods and Inference
