A General Decision Theory for Huber's $\epsilon$-Contamination Model
Mengjie Chen, Chao Gao, Zhao Ren

TL;DR
This paper develops a comprehensive decision theory for robust statistics under Huber's contamination model, introducing Scheffé estimate-based methods that adapt to contamination levels and achieve minimax optimal rates across various estimation tasks.
Contribution
It introduces a general decision framework for robust estimation under contamination, utilizing Scheffé estimates for adaptive, minimax optimal estimators in multiple statistical models.
Findings
Achieves minimax rates with optimal dependence on the contamination proportion.
Develops a Scheffé estimate-based testing procedure with an optimal error exponent.
Constructs robust estimators for nonparametric density estimation, sparse linear regression, and low-rank trace regression.
Abstract
Today's data pose unprecedented challenges to statisticians. They may be incomplete, corrupted or exposed to some unknown source of contamination. We need new methods and theories to grapple with these challenges. Robust estimation is one of the revived fields with potential to accommodate such complexity and glean useful information from modern datasets. Following our recent work on high dimensional robust covariance matrix estimation, we establish a general decision theory for robust statistics under Huber's $\epsilon$-contamination model. We propose a solution using the Scheffé estimate to a robust two-point testing problem that leads to the construction of robust estimators adaptive to the proportion of contamination. Applying the general theory, we construct robust estimators for nonparametric density estimation, sparse linear regression and low-rank trace regression. We show that…
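To make the two-point testing idea concrete, here is a minimal sketch, not the authors' procedure. It assumes the standard form of Huber's model, in which each observation is drawn from $(1-\epsilon) P_\theta + \epsilon Q$ with $Q$ arbitrary, and the textbook Scheffé test, which compares the empirical mass of the set $A = \{x : p_0(x) > p_1(x)\}$ with its mass under each candidate. The unit-variance Gaussian location family, the point-mass contamination, and the names `sample_contaminated` and `scheffe_test` are illustrative assumptions that do not appear in the paper summary above.

```python
import math
import random


def normal_cdf(x, mu=0.0, sigma=1.0):
    """CDF of N(mu, sigma^2), computed with the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def sample_contaminated(n, theta, eps, outlier, rng):
    """Draw n points from (1 - eps) * N(theta, 1) + eps * (point mass at `outlier`).

    The point-mass contamination Q is an assumption made for illustration;
    Huber's model allows Q to be any distribution.
    """
    return [outlier if rng.random() < eps else rng.gauss(theta, 1.0)
            for _ in range(n)]


def scheffe_test(xs, theta0, theta1):
    """Return 0 or 1: the hypothesis whose mass on the Scheffe set is closer
    to the empirical mass.

    For N(theta0, 1) versus N(theta1, 1) with theta0 < theta1, the Scheffe set
    A = {x : p0(x) > p1(x)} is the half-line {x < (theta0 + theta1) / 2}.
    """
    if not theta0 < theta1:
        raise ValueError("this sketch assumes theta0 < theta1")
    mid = 0.5 * (theta0 + theta1)
    empirical_mass = sum(x < mid for x in xs) / len(xs)
    mass_under_0 = normal_cdf(mid, theta0)
    mass_under_1 = normal_cdf(mid, theta1)
    if abs(empirical_mass - mass_under_0) <= abs(empirical_mass - mass_under_1):
        return 0
    return 1


if __name__ == "__main__":
    rng = random.Random(0)
    # True parameter is 0; 10% of the points are replaced by a far outlier.
    xs = sample_contaminated(2000, theta=0.0, eps=0.1, outlier=50.0, rng=rng)
    print("Scheffe test picks hypothesis:", scheffe_test(xs, 0.0, 2.0))
    print("Sample mean:", sum(xs) / len(xs))
```

In this setup the outliers can shift the empirical mass of $A$ by at most about $\epsilon$, so the test still selects the true hypothesis, while the sample mean is pulled far toward the outlier. The bounded influence of contamination on set probabilities is what makes a test of this kind a natural building block for robust estimators.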
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Statistical Methods and Inference · Distributed Sensor Networks and Detection Algorithms
