Fusion Subspace Clustering for Incomplete Data

Usman Mahmood; Daniel Pimentel-Alarcón

arXiv:2205.10872·cs.LG·May 24, 2022

TL;DR

Fusion subspace clustering learns low-dimensional structure in large, incomplete datasets by assigning each data point its own subspace and fusing the subspaces of points in the same cluster, which improves both clustering and data completion, especially when data is missing.

Contribution

The paper introduces fusion subspace clustering, a novel method that handles incomplete data, directly accounts for noise, and has a sample complexity that approaches the information-theoretic limit.

Findings

01

Performs comparably to state-of-the-art with complete data

02

Performs dramatically better than the state-of-the-art when data is missing

03

Provides convergence guarantees and a natural model selection method

Abstract

This paper introduces fusion subspace clustering, a novel method to learn low-dimensional structures that approximate large scale yet highly incomplete data. The main idea is to assign each datum to a subspace of its own, and minimize the distance between the subspaces of all data, so that subspaces of the same cluster get fused together. Our method allows low, high, and even full-rank data; it directly accounts for noise, and its sample complexity approaches the information-theoretic limit. In addition, our approach provides a natural model selection clusterpath, and a direct completion method. We give convergence guarantees, analyze computational complexity, and show through extensive experiments on real and synthetic data that our approach performs comparably to the state-of-the-art with complete data, and dramatically better if data is missing.
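
The main idea in the abstract (one subspace per datum, plus a penalty that pulls subspaces together) can be illustrated with a small objective function. This is only a sketch under stated assumptions, not the paper's formulation: the abstract does not specify the distance between subspaces or the form of the penalty, so the Frobenius distance between orthogonal projectors, the least-squares fit on observed entries, and the names `projector`, `fusion_objective`, and `lam` are all hypothetical choices made for illustration.

```python
import numpy as np

def projector(U):
    """Orthogonal projector onto the column span of U (assumed full column rank)."""
    Q, _ = np.linalg.qr(U)
    return Q @ Q.T

def fusion_objective(X, mask, bases, lam):
    """Illustrative objective: fit on observed entries + lam * pairwise subspace distance.

    X     : (d, n) data matrix; values at unobserved positions are ignored
    mask  : (d, n) boolean array, True where an entry is observed
    bases : list of n arrays of shape (d, r), one candidate basis per column
    lam   : weight of the fusion penalty (hypothetical parameter)
    """
    n = X.shape[1]
    fit = 0.0
    for i in range(n):
        obs = mask[:, i]
        U_obs = bases[i][obs, :]          # basis restricted to observed rows
        x_obs = X[obs, i]
        coeffs, *_ = np.linalg.lstsq(U_obs, x_obs, rcond=None)
        fit += np.sum((x_obs - U_obs @ coeffs) ** 2)
    # Assumed distance: Frobenius norm of the difference of projectors.
    P = [projector(U) for U in bases]
    fuse = sum(np.linalg.norm(P[i] - P[j], "fro")
               for i in range(n) for j in range(i + 1, n))
    return fit + lam * fuse

# Toy data: two orthogonal lines in R^4, two points on each, one entry missing per column.
u1 = np.array([1.0, 1.0, 0.0, 0.0]) / np.sqrt(2)
u2 = np.array([0.0, 0.0, 1.0, 1.0]) / np.sqrt(2)
X = np.column_stack([2.0 * u1, -1.0 * u1, 3.0 * u2, 0.5 * u2])
mask = np.ones((4, 4), dtype=bool)
mask[np.arange(4), np.arange(4)] = False

# Each column gets the basis of its true line: perfect fit, penalty only across clusters.
true_bases = [u1[:, None], u1[:, None], u2[:, None], u2[:, None]]
value = fusion_objective(X, mask, true_bases, lam=1.0)

# Forcing every column onto one line removes the penalty but hurts the fit.
fused_value = fusion_objective(X, mask, [u1[:, None]] * 4, lam=1.0)
```

In this toy setup, raising `lam` makes the fully fused configuration cheaper relative to the per-cluster one, which gives an intuition for how sweeping a fusion weight could trace out a path of clusterings of the kind the abstract calls a clusterpath.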

Taxonomy

Topics: Face and Expression Recognition · Advanced Clustering Algorithms Research · Anomaly Detection Techniques and Applications