Cellwise Outliers
Mia Hubert, Jakob Raymaekers, Peter J. Rousseeuw

TL;DR
This paper reviews recent advances in detecting and handling cellwise outliers in data, emphasizing the need for new techniques that differ from traditional casewise methods, especially in high-dimensional contexts.
Contribution
It provides a comprehensive review of recent progress in robust statistical methods for cellwise outliers across various data analysis techniques.
Findings
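Depending on the dimension, even a relatively small proportion of outlying cells can contaminate more than half of the cases, which breaks existing casewise methods.
Detecting cellwise outliers and constructing cellwise robust methods requires techniques quite different from the casewise setting, including giving up some intuitive equivariance properties.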
High-dimensional data analysis increasingly adopts cellwise robust methods.
Abstract
In statistics and machine learning, the traditional meaning of the terms 'outlier' and 'anomaly' is a case in the dataset that behaves differently from the bulk of the data. This raises suspicion that it may belong to a different population. But nowadays increasing attention is being paid to so-called cellwise outliers. These are individual values somewhere in the data matrix (or data tensor). Depending on the dimension, even a relatively small proportion of outlying cells can contaminate over half the cases, which is a problem for existing casewise methods. It turns out that detecting cellwise outliers as well as constructing cellwise robust methods requires techniques that are quite different from the casewise setting. For instance, one has to let go of some intuitive equivariance properties. The problem is difficult, but the past decade has seen substantial progress. For…
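The claim that a small proportion of outlying cells can contaminate over half the cases can be illustrated with a short calculation. The sketch below assumes that each cell is outlying independently with the same probability eps; this independence model is an assumption made here for illustration and is not stated in the abstract. Under it, a case with d cells contains at least one outlying cell with probability 1 - (1 - eps)^d. The function names are hypothetical.

```python
import math


def contaminated_case_fraction(eps: float, d: int) -> float:
    """Expected fraction of cases with at least one outlying cell.

    Assumes (for illustration only) that each of the d cells in a case
    is outlying independently with probability eps.
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    if d < 1:
        raise ValueError("d must be a positive integer")
    return 1.0 - (1.0 - eps) ** d


def min_dimension_for_majority(eps: float) -> int:
    """Smallest dimension d at which more than half the cases are contaminated.

    Solves (1 - eps)^d < 1/2, i.e. d > log(1/2) / log(1 - eps).
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    return math.floor(math.log(0.5) / math.log(1.0 - eps)) + 1


if __name__ == "__main__":
    # Print how the contaminated fraction grows with the dimension.
    for d in (1, 5, 14, 50, 100):
        frac = contaminated_case_fraction(0.05, d)
        print(f"eps=0.05, d={d:3d}: {frac:.3f} of cases contaminated")
```

Under this assumption, a cell contamination rate of 5% already affects more than half of the cases once the data has 14 variables, which is the point at which casewise methods that tolerate at most 50% contaminated cases can no longer be relied on.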
