Robust Chauvenet Rejection: Powerful, but Easy to Use Outlier Detection for Heavily Contaminated Data Sets
Nicholas Konz, Daniel E. Reichart

TL;DR
This paper introduces a Python implementation of Robust Chauvenet Rejection (RCR), an effective outlier detection method for heavily contaminated data sets, demonstrating its accuracy, speed, and versatility in one- and multi-dimensional contexts.
Contribution
The paper presents a new Python package for RCR, enhancing its accessibility and usability while maintaining its high performance for outlier rejection in contaminated data.
Findings
RCR effectively cleans heavily contaminated data sets.
The Python implementation maintains the speed of the original C++ version.
RCR performs well in both one-dimensional and multi-dimensional data analysis.
Abstract
In Maples et al. (2018) we introduced Robust Chauvenet Outlier Rejection, or RCR, a novel outlier rejection technique that evolves Chauvenet's Criterion by sequentially applying different measures of central tendency and empirically determining the rejective sigma value. RCR is especially powerful for cleaning heavily-contaminated samples, and unlike other methods such as sigma clipping, it manages to be both accurate and precise when characterizing the underlying uncontaminated distributions of data sets, by using decreasingly robust but increasingly precise statistics in sequence. For this work, we present RCR from a software standpoint, newly implemented as a Python package while maintaining the speed of the C++ original. RCR has been well-tested, calibrated and simulated, and it can be used for both one-dimensional outlier rejection and n-dimensional model-fitting, with or without…
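To make the idea in the abstract concrete, the following is a minimal sketch of a single robust Chauvenet stage, not the RCR package or its API. It assumes the median as the measure of central tendency and the 68.3rd-percentile absolute deviation as the width, and it rejects the most deviant point whenever fewer than 0.5 such points would be expected in a Gaussian sample of the same size (Chauvenet's Criterion). The function names `robust_sigma` and `chauvenet_reject` are hypothetical, and the sequencing of several statistics and the empirical correction of the sigma value described above are deliberately left out.

```python
import math
import statistics


def robust_sigma(values, center):
    """68.3rd-percentile absolute deviation from `center` (assumed robust width)."""
    deviations = sorted(abs(v - center) for v in values)
    # Index of the 68.3rd percentile, clamped to the last valid position.
    idx = min(len(deviations) - 1, int(0.683 * len(deviations)))
    return deviations[idx]


def chauvenet_reject(values):
    """Drop the single most deviant point repeatedly while it fails Chauvenet's Criterion."""
    kept = list(values)
    while len(kept) > 2:
        mu = statistics.median(kept)
        sigma = robust_sigma(kept, mu)
        if sigma == 0:
            break  # degenerate width; nothing sensible to reject against
        worst = max(kept, key=lambda v: abs(v - mu))
        z = abs(worst - mu) / sigma
        # Expected count of points at least this deviant in a Gaussian sample
        # of this size: N * P(|Z| > z), with P(|Z| > z) = erfc(z / sqrt(2)).
        expected = len(kept) * math.erfc(z / math.sqrt(2))
        if expected >= 0.5:
            break  # the most deviant point is plausible, so stop
        kept.remove(worst)
    return kept
```

Called on a small sample such as eight values clustered near 10 plus a few gross outliers, `chauvenet_reject` would be expected to return only the clustered values. Because the median and the percentile-based width barely move when outliers are present, a stage like this stays usable at contamination levels where a mean-and-standard-deviation cut would be dragged toward the outliers.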
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
