MacroPCA: An all-in-one PCA method allowing for missing values as well as cellwise and rowwise outliers
Mia Hubert, Peter J. Rousseeuw, Wannes Van den Bossche

TL;DR
MacroPCA is a novel PCA method that simultaneously handles missing data, cellwise outliers, and rowwise outliers, providing a robust and versatile tool for multivariate data analysis.
Contribution
It introduces the first PCA approach capable of managing missing values, cellwise outliers, and rowwise outliers all at once, combining strengths of existing robust methods.
Findings
MacroPCA effectively detects outliers in complex data.
The method maintains robustness with high proportions of contaminated cells.
It is suitable for online process control applications.
Abstract
Multivariate data are typically represented by a rectangular matrix (table) in which the rows are the objects (cases) and the columns are the variables (measurements). When there are many variables one often reduces the dimension by principal component analysis (PCA), which in its basic form is not robust to outliers. Much research has focused on handling rowwise outliers, i.e. rows that deviate from the majority of the rows in the data (for instance, they might belong to a different population). In recent years cellwise outliers have also been receiving attention. These are suspicious cells (entries) that can occur anywhere in the table. Even a relatively small proportion of outlying cells can contaminate over half the rows, which causes rowwise robust methods to break down. In this paper a new PCA method is constructed which combines the strengths of two existing robust methods in order to…
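The abstract's claim that a small proportion of outlying cells can contaminate over half the rows can be checked with a short calculation. The sketch below is illustrative only and is not taken from the paper: it assumes each cell is contaminated independently with the same probability `eps`, so a row with `d` variables is fully clean with probability `(1 - eps) ** d`. The function names are hypothetical.

```python
import math


def contaminated_row_fraction(eps: float, d: int) -> float:
    """Expected fraction of rows containing at least one outlying cell.

    Assumption (not from the paper): every cell is contaminated
    independently with probability eps, and each row has d cells.
    """
    return 1.0 - (1.0 - eps) ** d


def min_columns_for_majority(eps: float) -> int:
    """Smallest d at which more than half of the rows are expected
    to contain at least one outlying cell, under the same assumption."""
    # Solve 1 - (1 - eps)**d > 0.5 for integer d.
    return math.floor(math.log(0.5) / math.log(1.0 - eps)) + 1


if __name__ == "__main__":
    eps = 0.05  # 5% of cells contaminated
    for d in (5, 10, 14, 50):
        frac = contaminated_row_fraction(eps, d)
        print(f"d={d:3d}: {frac:.3f} of rows expected to be contaminated")
    print("columns needed for a contaminated majority:",
          min_columns_for_majority(eps))
```

Under this assumption, 5% contaminated cells already affect about 51% of the rows once there are 14 variables, which is the regime where a method that only tolerates a minority of outlying rows can be expected to fail.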
