TL;DR
ECOD is a simple, hyperparameter-free unsupervised outlier detection method that uses per-dimension empirical cumulative distribution functions to identify rare data points in the tails of a distribution efficiently and interpretably.
Contribution
We introduce ECOD, a novel outlier detection algorithm that is nonparametric and scalable, and that outperforms existing methods on multiple benchmarks.
Findings
ECOD outperforms 11 state-of-the-art methods in accuracy.
ECOD is computationally efficient and scalable to large datasets.
ECOD provides interpretable outlier scores based on empirical distributions.
Abstract
Outlier detection refers to the identification of data points that deviate from a general data distribution. Existing unsupervised approaches often suffer from high computational cost, complex hyperparameter tuning, and limited interpretability, especially when working with large, high-dimensional datasets. To address these issues, we present a simple yet effective algorithm called ECOD (Empirical-Cumulative-distribution-based Outlier Detection), which is inspired by the fact that outliers are often the "rare events" that appear in the tails of a distribution. In a nutshell, ECOD first estimates the underlying distribution of the input data in a nonparametric fashion by computing the empirical cumulative distribution per dimension of the data. ECOD then uses these empirical distributions to estimate tail probabilities per dimension for each data point. Finally, ECOD computes an outlier…
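The steps described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the abstract is cut off before the scoring step, so the aggregation used here (summing negative log tail probabilities over dimensions and keeping the larger of the left-tail and right-tail sums) is an assumption, and the function names ecdf_tail_probs and ecod_like_scores are hypothetical.

```python
import math
from bisect import bisect_left, bisect_right


def ecdf_tail_probs(column):
    """Empirical left and right tail probabilities for each value of one dimension."""
    n = len(column)
    ordered = sorted(column)
    # Left tail: fraction of samples <= v. Right tail: fraction of samples >= v.
    left = [bisect_right(ordered, v) / n for v in column]
    right = [(n - bisect_left(ordered, v)) / n for v in column]
    return left, right


def ecod_like_scores(rows):
    """Return one score per row; a larger score means the row looks more outlying.

    ASSUMPTION: the aggregation rule is not given in the truncated abstract.
    This sketch sums -log(tail probability) across dimensions for each tail
    and keeps the larger of the two sums.
    """
    n = len(rows)
    left_sum = [0.0] * n
    right_sum = [0.0] * n
    for column in zip(*rows):  # iterate over dimensions
        left, right = ecdf_tail_probs(column)
        for i in range(n):
            # Each tail probability is at least 1/n, so the log is always defined.
            left_sum[i] -= math.log(left[i])
            right_sum[i] -= math.log(right[i])
    return [max(l, r) for l, r in zip(left_sum, right_sum)]


# Toy data: four similar points and one point far out in the right tail of both dimensions.
data = [(1.0, 2.1), (1.1, 1.9), (0.9, 2.0), (1.05, 2.05), (8.0, 9.0)]
scores = ecod_like_scores(data)
most_outlying = max(range(len(data)), key=scores.__getitem__)
print(most_outlying, [round(s, 3) for s in scores])
```

Because the tail probabilities come from empirical CDFs, the scores in this sketch depend only on the rank of each value within its dimension, and each dimension's contribution to a score can be inspected separately.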
