Robust Estimation under Heavy Contamination using Enlarged Models
Takafumi Kanamori, Hironori Fujisawa

TL;DR
This paper introduces a robust statistical method using scoring rules and enlarged models to estimate parameters and contamination ratios, effectively detecting outliers even under complex heterogeneous contamination scenarios.
Contribution
It develops a novel approach combining scoring rules with enlarged models to estimate both model parameters and contamination ratios, including in heterogeneous contamination settings.
Findings
Effective outlier detection in contaminated data.
Robust parameter estimation under complex contamination.
Numerical experiments show improved performance over traditional methods.
Abstract
In data analysis, contamination caused by outliers is inevitable, and robust statistical methods are in strong demand. In this paper, we develop a new approach to robust data analysis based on scoring rules. A scoring rule is a discrepancy measure used to assess the quality of probabilistic forecasts. We propose a simple way of estimating not only the parameter of the statistical model but also the contamination ratio of outliers. Estimating the contamination ratio is important, since one can detect outliers in the training samples based on the estimated contamination ratio. For this purpose, we use scoring rules with an extended statistical model, called the enlarged model. Regression problems are also considered. We study a complex heterogeneous contamination, in which the contamination ratio of outliers in the dependent variable may depend on the…
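The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which scoring rule or model family is used, so the sketch assumes the density power score with exponent beta and a one-dimensional Gaussian model N(mu, sigma^2), enlarged to c * N(mu, sigma^2) with a free mass c > 0. Under these assumptions, setting the gradient of the empirical score to zero gives weighted-moment fixed-point updates for mu and sigma^2 and a closed form for c, and 1 - c serves as the contamination-ratio estimate. The function names `fit_enlarged_gaussian` and `flag_outliers`, the median/MAD starting point, and the rule of flagging the lowest-density points are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_enlarged_gaussian(x, beta=0.5, n_iter=200, tol=1e-10):
    """Fit c * N(mu, sigma^2) by minimizing the empirical density power score.

    Assumed loss (not taken from the paper):
        L = -(1/beta) * mean(f(x_i)**beta) + (1/(1+beta)) * integral f**(1+beta)
    with f = c * N(mu, sigma^2). Returns (mu, sigma, eps) where eps = 1 - c.
    """
    x = np.asarray(x, dtype=float)
    # Robust starting point: median and scaled median absolute deviation.
    mu = np.median(x)
    sigma2 = (1.4826 * np.median(np.abs(x - mu))) ** 2
    for _ in range(n_iter):
        # Weights proportional to the model density raised to the power beta.
        w = np.exp(-0.5 * beta * (x - mu) ** 2 / sigma2)
        mu_new = np.sum(w * x) / np.sum(w)
        # The factor (1 + beta) undoes the shrinkage caused by the weights.
        sigma2_new = (1.0 + beta) * np.sum(w * (x - mu_new) ** 2) / np.sum(w)
        converged = abs(mu_new - mu) + abs(sigma2_new - sigma2) < tol
        mu, sigma2 = mu_new, sigma2_new
        if converged:
            break
    w = np.exp(-0.5 * beta * (x - mu) ** 2 / sigma2)
    # Stationarity in c gives c = mean(p**beta) / integral p**(1+beta),
    # which for a Gaussian p simplifies to sqrt(1 + beta) * mean(w).
    c = np.sqrt(1.0 + beta) * np.mean(w)
    eps = min(max(1.0 - c, 0.0), 1.0)
    return mu, np.sqrt(sigma2), eps


def flag_outliers(x, mu, eps):
    """Flag the round(eps * n) points farthest from mu (lowest fitted density)."""
    x = np.asarray(x, dtype=float)
    k = int(round(eps * x.size))
    flagged = np.zeros(x.size, dtype=bool)
    if k > 0:
        flagged[np.argsort(np.abs(x - mu))[-k:]] = True
    return flagged


# Synthetic example: 70% inliers from N(0, 1), 30% outliers near 10.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0.0, 1.0, 1400), rng.normal(10.0, 0.5, 600)])
mu_hat, sigma_hat, eps_hat = fit_enlarged_gaussian(x)
flagged = flag_outliers(x, mu_hat, eps_hat)
print(mu_hat, sigma_hat, eps_hat, int(flagged.sum()))
```

In this sketch the outliers receive weights close to zero, so they barely affect mu and sigma, while the missing mass shows up in c. The heterogeneous regression setting mentioned in the abstract is not covered here.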
Taxonomy
Topics: Advanced Statistical Methods and Models · Advanced Statistical Process Monitoring · Statistical Methods and Inference
