Density Estimation with Contaminated Data: Minimax Rates and Theory of Adaptation

Haoyang Liu; Chao Gao

arXiv:1712.07801 · math.ST · July 30, 2018

Open Access

TL;DR

This paper analyzes the fundamental limits of density estimation under contamination, deriving minimax rates and exploring the costs of adapting to contamination levels and smoothness, with implications for robust statistical inference.

Contribution

It provides the first comprehensive minimax rates for contaminated density estimation and develops adaptation strategies, including variations of Lepski's method, under different contamination models.

Findings

01

Minimax rate characterized by contamination and smoothness parameters

02

Optimal adaptation incurs small or no costs for certain parameters

03

Simultaneous adaptation to both contamination proportion and smoothness is impossible when the contamination distribution is arbitrary (Huber's ε-contamination model)

Abstract

This paper studies density estimation under pointwise loss in the setting of a contamination model. The goal is to estimate $f(x_{0})$ at some $x_{0} \in \mathbb{R}$ from i.i.d. observations $X_{1}, \dots, X_{n} \sim (1 - \epsilon) f + \epsilon g$, where $g$ stands for a contamination distribution. In the context of multiple testing, this can be interpreted as estimating the null density at a point. We carefully study the effect of contamination on estimation through the following model indices: contamination proportion $\epsilon$, smoothness of the target density $\beta_{0}$, smoothness of the contamination density $\beta_{1}$, and level of contamination $m$ at the point to be estimated, i.e. $g(x_{0}) \leq m$. It is shown that the minimax rate with respect to the squared error loss is of order $$ [n^{-\frac{2\beta_0}{2\beta_0+1}}]\vee[\epsilon^2(1\wedge…
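To make the sampling model in the abstract concrete, here is a minimal simulation sketch in Python. It is an illustration under stated assumptions, not the authors' estimator: the target density f = N(0, 1), the contamination density g = N(3, 1), the Gaussian kernel, the bandwidth n^(-1/(2·β0+1)) with β0 = 2, and all function names are hypothetical choices made for this example. A kernel estimate at x0 targets the mixture density (1 − ε)·f(x0) + ε·g(x0), so dividing by (1 − ε) recovers f(x0) up to a contamination bias of at most ε·g(x0)/(1 − ε), which is small when g(x0) ≤ m is small.

```python
import math
import random


def sample_contaminated(n, eps, draw_f, draw_g, rng):
    """Draw n i.i.d. points from the mixture (1 - eps) * f + eps * g."""
    return [draw_g(rng) if rng.random() < eps else draw_f(rng) for _ in range(n)]


def kde_at_point(xs, x0, h):
    """Gaussian-kernel density estimate at the single point x0, bandwidth h."""
    norm = 1.0 / (len(xs) * h * math.sqrt(2.0 * math.pi))
    return norm * sum(math.exp(-0.5 * ((x - x0) / h) ** 2) for x in xs)


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2), used only to compare against the truth."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


rng = random.Random(0)
n, eps, x0 = 20000, 0.1, 0.0

# Assumed for illustration only: f = N(0, 1) is the target, g = N(3, 1) the contamination.
draw_f = lambda r: r.gauss(0.0, 1.0)
draw_g = lambda r: r.gauss(3.0, 1.0)

xs = sample_contaminated(n, eps, draw_f, draw_g, rng)

# Assumed bandwidth n^(-1/(2*beta0 + 1)) with beta0 = 2.
beta0 = 2.0
h = n ** (-1.0 / (2.0 * beta0 + 1.0))

# The kernel estimate targets the mixture density at x0, not f(x0) itself.
mixture_estimate = kde_at_point(xs, x0, h)

# With eps known, rescaling removes the (1 - eps) factor; the leftover
# contamination bias is at most eps * g(x0) / (1 - eps).
naive_f_estimate = mixture_estimate / (1.0 - eps)

print("estimate of f(x0):", naive_f_estimate)
print("true f(x0):       ", normal_pdf(x0))
```

The rescaling step uses the true ε, which is exactly the knowledge an adaptive procedure would not have; the sketch only shows how contamination enters the estimate at a point.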

Taxonomy

Topics: Advanced Statistical Process Monitoring · Statistical Methods in Clinical Trials · Statistical Methods and Inference