Generalized Resilience and Robust Statistics

Banghua Zhu; Jiantao Jiao; Jacob Steinhardt

arXiv:1909.08755 · math.ST · December 15, 2020 · 26 cites

Open Access

TL;DR

This paper extends robust statistics to general Wasserstein perturbations, introducing a resilient estimation framework that encompasses various data corruption types and simplifies existing methods for mean, regression, and covariance estimation.

Contribution

The paper generalizes the resilience property from total variation perturbations to Wasserstein perturbations, yielding robust estimators that simplify, and in some cases improve, existing population and finite-sample results.

Findings

01. Robust estimation is feasible under Wasserstein perturbations when the distribution satisfies moment or hypercontractive conditions.

02. The proposed estimators simplify and sometimes improve existing results in both population and finite-sample settings.

03. The paper connects its framework to recent GAN-based robust estimation methods.

Abstract

Robust statistics traditionally focuses on outliers, or perturbations in total variation distance. However, a dataset could be corrupted in many other ways, such as systematic measurement errors and missing covariates. We generalize the robust statistics approach to consider perturbations under any Wasserstein distance, and show that robust estimation is possible whenever a distribution's population statistics are robust under a certain family of friendly perturbations. This generalizes a property called resilience previously employed in the special case of mean estimation with outliers. We justify the generalized resilience property by showing that it holds under moment or hypercontractive conditions. Even in the total variation case, these subsume conditions in the literature for mean estimation, regression, and covariance estimation; the resulting analysis simplifies and sometimes…
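
As a rough illustration of the resilience property the abstract mentions, the sketch below covers only the earlier special case of mean estimation with outliers, not the paper's generalized Wasserstein version. It assumes the common formulation in which a sample is resilient if deleting any eps-fraction of its points moves the mean only slightly. The function name `empirical_resilience_1d` is hypothetical, and the restriction to one-dimensional data is an assumption made so that the worst-case deletion can be found exactly.

```python
from statistics import fmean


def empirical_resilience_1d(points, eps):
    """Largest shift of the sample mean when up to an eps-fraction of
    one-dimensional points is deleted.

    Illustrative assumption: in one dimension the deletion that moves the
    mean the most removes either the k largest or the k smallest points,
    so checking those two candidates is enough.
    """
    if not 0 <= eps < 1:
        raise ValueError("eps must lie in [0, 1)")
    xs = sorted(float(p) for p in points)
    n = len(xs)
    if n == 0:
        raise ValueError("points must be non-empty")

    k = int(eps * n)  # number of points that may be deleted
    if k == 0:
        return 0.0

    full_mean = fmean(xs)
    shift_drop_top = abs(fmean(xs[: n - k]) - full_mean)   # delete k largest
    shift_drop_bottom = abs(fmean(xs[k:]) - full_mean)     # delete k smallest
    return max(shift_drop_top, shift_drop_bottom)


# A heavy right tail makes the mean fragile: one deletion moves it a lot.
heavy_tailed = [0, 1, 2, 97]
# Tightly clustered data is resilient: no single deletion moves it much.
clustered = [24, 25, 25, 26]

shift_heavy = empirical_resilience_1d(heavy_tailed, 0.25)
shift_clustered = empirical_resilience_1d(clustered, 0.25)
```

A small returned shift suggests the sample mean is stable under deletions of that size, which is the intuition behind resilience in the total variation setting. Extending the check to higher dimensions or to Wasserstein perturbations would need a different search over perturbations and is not attempted here.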

Taxonomy

Topics: Advanced Statistical Methods and Models · Anomaly Detection Techniques and Applications · Adversarial Robustness in Machine Learning