Anomaly Detection by Robust Statistics

Peter J. Rousseeuw; Mia Hubert

arXiv:1707.09752·stat.ML·January 13, 2021·WIREs Data Mining Knowl. Discov.

TL;DR

This paper reviews robust statistical methods for anomaly detection across various data types, emphasizing their ability to identify outliers that can distort analysis or contain valuable information.

Contribution

It provides an overview of robust techniques for outlier detection in univariate, multivariate, and high-dimensional data, and introduces the recent topic of cellwise outliers.

Findings

01 Robust methods detect outliers in univariate, low-dimensional, and high-dimensional data by first fitting the majority of the data.

02 Graphical tools built on these robust fits help visualize and analyze outliers.

03 Cellwise outlier detection is introduced as a challenging new topic.

Abstract

Real data often contain anomalous cases, also known as outliers. These may spoil the resulting analysis but they may also contain valuable information. In either case, the ability to detect such anomalies is essential. A useful tool for this purpose is robust statistics, which aims to detect the outliers by first fitting the majority of the data and then flagging data points that deviate from it. We present an overview of several robust methods and the resulting graphical outlier detection tools. We discuss robust procedures for univariate, low-dimensional, and high-dimensional data, such as estimating location and scatter, linear regression, principal component analysis, classification, clustering, and functional data analysis. Also the challenging new topic of cellwise outliers is introduced.
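The abstract's core idea, fitting the majority of the data first and then flagging points that deviate from that fit, can be illustrated in the univariate case. The sketch below is not taken from the paper: it assumes the median as the robust location estimate, the median absolute deviation (MAD) scaled by 1.4826 as the robust scale estimate, and a cutoff of 2.5 on the resulting robust z-scores. The function names and the data are hypothetical and chosen only for illustration.

```python
from statistics import mean, median, stdev


def robust_z_scores(values):
    """Robust z-scores: (x - median) / (1.4826 * MAD)."""
    center = median(values)
    # MAD: median of the absolute deviations from the median.
    mad = median(abs(x - center) for x in values)
    # 1.4826 makes the MAD consistent with the standard deviation
    # when the bulk of the data is Gaussian.
    scale = 1.4826 * mad
    if scale == 0:
        raise ValueError("MAD is zero: at least half the values are identical")
    return [(x - center) / scale for x in values]


def flag_outliers(values, cutoff=2.5):
    """Indices whose robust z-score exceeds the (assumed) cutoff."""
    return [i for i, z in enumerate(robust_z_scores(values)) if abs(z) > cutoff]


def classical_flags(values, cutoff=2.5):
    """Same rule with the non-robust mean and standard deviation."""
    m, s = mean(values), stdev(values)
    return [i for i, x in enumerate(values) if abs(x - m) / s > cutoff]


# Illustrative data: six similar measurements and one gross error.
data = [10.1, 9.8, 10.3, 9.9, 10.0, 10.2, 25.0]

print("robust flags:   ", flag_outliers(data))
print("classical flags:", classical_flags(data))
```

On this data the robust rule flags the last value, while the classical rule flags nothing: the outlier inflates the mean and the standard deviation enough to hide itself. This masking effect is the reason for fitting the majority of the data with estimators that a few anomalous values cannot dominate.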
