Density Estimation with Contaminated Data: Minimax Rates and Theory of Adaptation
Haoyang Liu, Chao Gao

TL;DR
This paper analyzes the fundamental limits of density estimation under contamination, deriving minimax rates and exploring the costs of adapting to contamination levels and smoothness, with implications for robust statistical inference.
Contribution
It provides the first comprehensive minimax rates for contaminated density estimation and develops adaptation strategies, including variations of Lepski's method, under different contamination models.
Findings
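The minimax rate is characterized by the contamination proportion, the smoothness of the target and contamination densities, and the level of contamination at the point of estimation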
Optimal adaptation incurs small or no costs for certain parameters
Simultaneous adaptation to both contamination proportion and smoothness is impossible when the contamination distribution is arbitrary (Huber's contamination model)
Abstract
This paper studies density estimation under pointwise loss in the setting of a contamination model. The goal is to estimate $f(x_0)$ at some $x_0\in\mathbb{R}$ with i.i.d. observations $X_1,\dots,X_n\sim(1-\epsilon)f+\epsilon g$, where $g$ stands for a contamination distribution. In the context of multiple testing, this can be interpreted as estimating the null density at a point. We carefully study the effect of contamination on estimation through the following model indices: contamination proportion $\epsilon$, smoothness of target density $\beta_0$, smoothness of contamination density $\beta_1$, and level of contamination $m$ at the point to be estimated, i.e. $g(x_0)\leq m$. It is shown that the minimax rate with respect to the squared error loss is of order $$ [n^{-\frac{2\beta_0}{2\beta_0+1}}]\vee[\epsilon^2(1\wedge…
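To make the contamination model in the abstract concrete, the following is a minimal simulation sketch, not the estimator analyzed in the paper. It draws observations from a mixture of a target density and a contamination density, then evaluates an ordinary kernel density estimate at a single point. Everything specific here is an assumption chosen for illustration: the target is taken to be N(0, 1), the contamination N(3, 1), the kernel is Gaussian, the bandwidth is set to n^(-1/5), and the helper names `sample_contaminated`, `kde_at_point`, and `normal_pdf` are hypothetical.

```python
import numpy as np


def sample_contaminated(n, eps, rng):
    """Draw n points from the mixture (1 - eps) * f + eps * g.

    Illustrative assumption: f = N(0, 1) is the target density and
    g = N(3, 1) is the contamination density.
    """
    from_g = rng.random(n) < eps              # True -> point comes from g
    x = rng.normal(0.0, 1.0, size=n)          # start with draws from f
    x[from_g] = rng.normal(3.0, 1.0, size=from_g.sum())
    return x


def kde_at_point(x, x0, h):
    """Gaussian-kernel density estimate at the single point x0."""
    u = (x - x0) / h
    return np.exp(-0.5 * u ** 2).sum() / (len(x) * h * np.sqrt(2 * np.pi))


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return np.exp(-0.5 * z ** 2) / (sigma * np.sqrt(2 * np.pi))


rng = np.random.default_rng(0)
n, eps, x0 = 20000, 0.1, 0.0
h = n ** (-1 / 5)                             # assumed bandwidth choice

x = sample_contaminated(n, eps, rng)
estimate = kde_at_point(x, x0, h)

target = normal_pdf(x0)                       # f(x0), the quantity of interest
mixture = (1 - eps) * normal_pdf(x0) + eps * normal_pdf(x0, mu=3.0)

print(f"KDE at x0:      {estimate:.4f}")
print(f"mixture at x0:  {mixture:.4f}")
print(f"target f(x0):   {target:.4f}")
```

Under these assumptions the uncorrected kernel estimate tracks the mixture density at the point rather than the target density, so its error for the target does not vanish as n grows. This is one informal way to see why the contamination proportion enters the rate separately from the usual nonparametric term.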
Taxonomy
Topics: Advanced Statistical Process Monitoring · Statistical Methods in Clinical Trials · Statistical Methods and Inference
