ALPCAHUS: Subspace Clustering for Heteroscedastic Data
Javier Salazar Cavazos, Jeffrey A Fessler, Laura Balzano

TL;DR
ALPCAHUS is a subspace clustering method for data with heteroscedastic noise. It estimates sample-specific noise variances and uses them to improve clustering accuracy, as demonstrated in simulations and on real data.
Contribution
It extends heteroscedastic PCA to subspace clustering, enabling estimation of sample-wise noise variances to enhance clustering in heterogeneous data.
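To make the idea of sample-wise noise variances concrete, here is a minimal sketch in Python with NumPy. It is not LR-ALPCAH or ALPCAHUS, whose details are not given on this page. It assumes a known orthonormal basis, estimates each sample's noise variance from its residual energy outside the subspace, and then refits the basis with inverse-variance weights. The function names `estimate_sample_variances` and `weighted_subspace` and all of the synthetic data settings are hypothetical choices made for illustration.

```python
import numpy as np

def estimate_sample_variances(Y, U):
    """Per-sample noise variance from residual energy (illustrative estimator only).

    Y: (D, N) data matrix whose columns are samples.
    U: (D, d) orthonormal basis of the assumed subspace.
    """
    D, d = U.shape
    residual = Y - U @ (U.T @ Y)  # part of each sample outside the subspace
    return np.sum(residual**2, axis=0) / (D - d)

def weighted_subspace(Y, variances, d):
    """Inverse-variance-weighted PCA: scale column i by 1/sqrt(v_i), keep top-d left singular vectors."""
    scaled = Y / np.sqrt(variances)  # broadcasts over columns
    left, _, _ = np.linalg.svd(scaled, full_matrices=False)
    return left[:, :d]

# Synthetic demo (all sizes and variances are assumptions for illustration).
rng = np.random.default_rng(0)
D, d = 50, 3
U_true = np.linalg.qr(rng.standard_normal((D, d)))[0]
true_var = np.concatenate([np.full(100, 0.01), np.full(100, 1.0)])  # clean vs. noisy samples
Y = U_true @ rng.standard_normal((d, 200)) + np.sqrt(true_var) * rng.standard_normal((D, 200))

est_var = estimate_sample_variances(Y, U_true)
U_hat = weighted_subspace(Y, est_var, d)
print("mean estimated variance, clean group:", est_var[:100].mean())
print("mean estimated variance, noisy group:", est_var[100:].mean())
```

In this sketch the noisy samples receive small weights, so the refit basis is driven mostly by the clean samples. In practice the basis is unknown, so the variance and basis estimates would have to be alternated or solved jointly.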
Findings
Outperforms existing clustering algorithms on heteroscedastic data
Effectively estimates sample-specific noise variances
Improves subspace basis estimation accuracy
Abstract
Principal component analysis (PCA) is a key tool in the field of data dimensionality reduction. Various methods, such as K-Subspaces (KSS), have been proposed to extend PCA to the union of subspaces (UoS) setting for clustering data that comes from multiple subspaces. However, some applications involve heterogeneous data that vary in quality due to noise characteristics associated with each data sample. Heteroscedastic methods aim to deal with such mixed data quality. This paper develops a heteroscedastic-based subspace clustering method, named ALPCAHUS, that can estimate the sample-wise noise variances and use this information to improve the estimate of the subspace bases associated with the low-rank structure of the data. This clustering algorithm builds on K-Subspaces (KSS) principles by extending the recently proposed heteroscedastic PCA method, named LR-ALPCAH, for clusters with…
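For readers unfamiliar with the K-Subspaces (KSS) principle mentioned in the abstract, the following is a minimal sketch of plain, homoscedastic KSS in Python with NumPy. It is not ALPCAHUS: every sample is treated as equally reliable, and the basis update is ordinary PCA via the SVD. The function name `kss`, its parameters, and the random initialization are assumptions made for illustration.

```python
import numpy as np

def kss(Y, K, d, n_iter=30, seed=0):
    """Plain K-Subspaces sketch (no noise-variance modeling).

    Y: (D, N) data matrix whose columns are samples.
    K: number of subspaces; d: dimension of each subspace.
    Returns (labels, bases) with labels of shape (N,) and K bases of shape (D, d).
    """
    rng = np.random.default_rng(seed)
    D, N = Y.shape
    # Random orthonormal bases as a starting point (one of many possible initializations).
    bases = [np.linalg.qr(rng.standard_normal((D, d)))[0] for _ in range(K)]
    labels = np.zeros(N, dtype=int)
    for _ in range(n_iter):
        # Assignment step: each sample goes to the subspace that captures the most
        # of its energy, which is equivalent to the smallest projection residual.
        energy = np.stack([np.sum((U.T @ Y) ** 2, axis=0) for U in bases])  # (K, N)
        new_labels = np.argmax(energy, axis=0)
        # Update step: refit each basis by PCA on the samples assigned to it.
        for k in range(K):
            Yk = Y[:, new_labels == k]
            if Yk.shape[1] >= d:  # keep the old basis if the cluster is too small
                left, _, _ = np.linalg.svd(Yk, full_matrices=False)
                bases[k] = left[:, :d]
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
    return labels, bases
```

A call such as `labels, bases = kss(Y, K=2, d=3)` returns one cluster label per column of `Y` together with the fitted bases. KSS is sensitive to initialization, so running it from several seeds and keeping the lowest-residual result is a common precaution. A heteroscedastic variant along the lines the abstract describes would replace the PCA update with a method that also estimates per-sample noise variances.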
Taxonomy
Topics: Advanced Clustering Algorithms Research
Methods: Principal Components Analysis
