Challenging the empirical mean and empirical variance: a deviation study
Olivier Catoni

TL;DR
This paper introduces new M-estimators for mean and variance based on PAC-Bayes bounds, demonstrating their robustness and superior deviation properties compared to empirical estimators, especially for heavy-tailed distributions.
Contribution
It proposes novel PAC-Bayes-based M-estimators for mean and variance that outperform empirical estimators under weak distributional assumptions.
Findings
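For any confidence level, some M-estimator has deviations of the same order as those of the empirical mean of a Gaussian sample, even when the sample is heavy-tailed.
Under bounded-variance or bounded-kurtosis assumptions, the worst-case deviations of the empirical mean are suboptimal, and the new estimators improve on them.
In experiments, the deviation quantile functions of the new estimators are uniformly lower than that of the empirical mean at all probability levels.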
Abstract
We present new M-estimators of the mean and variance of real valued random variables, based on PAC-Bayes bounds. We analyze the non-asymptotic minimax properties of the deviations of those estimators for sample distributions having either a bounded variance or a bounded variance and a bounded kurtosis. Under those weak hypotheses, allowing for heavy-tailed distributions, we show that the worst case deviations of the empirical mean are suboptimal. We prove indeed that for any confidence level, there is some M-estimator whose deviations are of the same order as the deviations of the empirical mean of a Gaussian statistical sample, even when the statistical sample is instead heavy-tailed. Experiments reveal that these new estimators perform even better than predicted by our bounds, showing deviation quantile functions uniformly lower at all probability levels than the empirical mean for…
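To make the idea of an M-estimator of the mean concrete, here is a minimal Python sketch. It is not taken from the text above: the influence function `psi`, the scale `alpha`, and the names `m_estimate_mean`, `variance_bound` and `eps` are assumptions chosen for illustration. The estimate is the root in theta of sum_i psi(alpha * (x_i - theta)) = 0, where psi grows only logarithmically, so a single extreme observation has limited pull. The formula used for `alpha` is a simplified choice, not necessarily the tuning analyzed in the paper.

```python
import math


def psi(x):
    # Assumed influence function: odd, increasing, with logarithmic growth.
    if x >= 0:
        return math.log1p(x + 0.5 * x * x)
    return -math.log1p(-x + 0.5 * x * x)


def m_estimate_mean(sample, variance_bound, eps=0.01, iterations=200):
    """Solve sum(psi(alpha * (x - theta))) = 0 for theta by bisection.

    variance_bound: assumed known upper bound on the variance (hypothetical input).
    eps: target failure probability used to set the scale alpha (simplified choice).
    """
    n = len(sample)
    alpha = math.sqrt(2.0 * math.log(1.0 / eps) / (n * variance_bound))

    def score(theta):
        return sum(psi(alpha * (x - theta)) for x in sample)

    # score is non-increasing in theta, non-negative at min(sample)
    # and non-positive at max(sample), so a root lies in between.
    lo, hi = min(sample), max(sample)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # 99 observations at 0 and one gross outlier.
    data = [0.0] * 99 + [1000.0]
    print("empirical mean:", sum(data) / len(data))
    print("M-estimate:    ", m_estimate_mean(data, variance_bound=1.0))
```

In this toy sample the empirical mean is dragged to 10 by the single outlier, while the M-estimate stays well below 1. The price is that the estimator needs a variance bound and a confidence level as inputs, whereas the empirical mean needs neither.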
Taxonomy
Topics: Statistical Methods and Inference · Risk and Portfolio Optimization · Distributed Sensor Networks and Detection Algorithms
