Sparse outlier-robust PCA for multi-source data

Patricia Puchhammer; Ines Wilms; Peter Filzmoser

arXiv:2407.16299 · stat.ME · February 26, 2026 · Stat. Comput.

TL;DR

This paper introduces a novel sparse outlier-robust PCA method designed for multi-source data, enabling feature selection, detection of global and local sparse patterns, and outlier resistance across multiple datasets.

Contribution

The paper presents a new PCA approach that jointly analyzes multiple datasets while incorporating structured sparsity and outlier robustness, a combination that prior single-source methods do not address.

Findings

01. Effective feature selection across multiple sources

02. Detection of global and local sparse patterns

03. Robust performance in simulations and real applications

Abstract

Sparse and outlier-robust Principal Component Analysis (PCA) has been a very active field of research recently. Yet most existing methods apply PCA to a single dataset, whereas multi-source data (i.e., multiple related datasets requiring joint analysis) arise across many scientific areas. We introduce a novel PCA methodology that simultaneously (i) selects important features, (ii) allows for the detection of global sparse patterns across multiple data sources as well as local source-specific patterns, and (iii) is resistant to outliers. To this end, we develop a regularization problem with a penalty that accommodates global-local structured sparsity patterns, and where the ssMRCD estimator is used as a plug-in to permit joint outlier-robust analysis across multiple data sources. We provide an efficient implementation of our proposal via the Alternating Direction Method of Multipliers and…
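
As a rough illustration of the global-local structured sparsity idea described in the abstract, the Python sketch below computes a penalty that combines an elementwise L1 term (local, source-specific sparsity) with a row-wise L2 term across sources (global sparsity), together with its proximal operator, the kind of update that typically appears inside an ADMM iteration. The exact penalty, its weighting, and the ADMM splitting used in the paper are not given in this summary, so the sparse-group-lasso form, the function names, and the parameters lam_local and lam_global are all assumptions made for illustration; the robust ssMRCD covariance plug-in is not shown.

```python
import numpy as np


def soft_threshold(x, t):
    """Elementwise soft-thresholding: the proximal operator of t * |x|."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def global_local_penalty(A, lam_local, lam_global):
    """Hypothetical global-local penalty for one principal component.

    A has shape (p, N): row j holds the loadings of feature j in each
    of the N sources. This sparse-group-lasso form is an assumption,
    not the paper's exact penalty.
    """
    local_term = np.abs(A).sum()                    # entrywise L1
    global_term = np.linalg.norm(A, axis=1).sum()   # L2 norm per feature row
    return lam_local * local_term + lam_global * global_term


def prox_global_local(A, lam_local, lam_global):
    """Proximal operator of the penalty above.

    For this L1 + row-wise L2 combination the prox is the composition
    of entrywise soft-thresholding followed by row-wise shrinkage.
    """
    S = soft_threshold(A, lam_local)
    norms = np.linalg.norm(S, axis=1, keepdims=True)
    # Guard against division by zero for rows that are already all zero.
    scale = np.maximum(1.0 - lam_global / np.maximum(norms, 1e-12), 0.0)
    return scale * S


if __name__ == "__main__":
    # Illustrative loadings: 3 features (rows) by 2 sources (columns).
    A = np.array([[3.0, 4.0], [0.1, 0.2], [3.0, 0.2]])
    print(global_local_penalty(A, 0.5, 1.0))
    print(prox_global_local(A, 0.5, 1.0))
```

With a loadings matrix whose rows are features and columns are sources, a row shrunk entirely to zero corresponds to a feature dropped in every source, while a single zero entry drops the feature in one source only.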

Taxonomy

Methods: Principal Components Analysis