Fusion Subspace Clustering for Incomplete Data
Usman Mahmood, Daniel Pimentel-Alarcón

TL;DR
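Fusion subspace clustering learns low-dimensional structure in large, highly incomplete datasets. It assigns each data point a subspace of its own and minimizes the distance between those subspaces, so that the subspaces of points in the same cluster fuse together. This yields both a clustering and a way to complete missing entries, with the largest gains when data is missing.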
Contribution
The paper introduces fusion subspace clustering, a novel method that handles incomplete data, accounts for noise, and approaches the information-theoretic sample complexity limit.
Findings
Performs comparably to state-of-the-art with complete data
Dramatically better performance with missing data
Provides convergence guarantees and a natural model selection method
Abstract
This paper introduces fusion subspace clustering, a novel method to learn low-dimensional structures that approximate large scale yet highly incomplete data. The main idea is to assign each datum to a subspace of its own, and minimize the distance between the subspaces of all data, so that subspaces of the same cluster get fused together. Our method allows low, high, and even full-rank data; it directly accounts for noise, and its sample complexity approaches the information-theoretic limit. In addition, our approach provides a natural model selection clusterpath, and a direct completion method. We give convergence guarantees, analyze computational complexity, and show through extensive experiments on real and synthetic data that our approach performs comparably to the state-of-the-art with complete data, and dramatically better if data is missing.
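The abstract describes the idea in words but does not give the objective function. The sketch below is one plausible reading, not the authors' formulation: each point gets its own basis, a fit term measures how far the observed entries lie from that subspace, and a fusion term penalizes the distance between every pair of subspaces. The function names, the use of projection matrices to measure subspace distance, the squared Frobenius norm, and the weight `lam` are all assumptions made for illustration. The `complete` helper shows what a direct completion step could look like once a point's subspace is known.

```python
import numpy as np

def masked_residual(x, mask, U):
    """Squared distance from the observed entries of x to span(U),
    using only the rows of U where mask is True."""
    U_obs, x_obs = U[mask], x[mask]
    coef, *_ = np.linalg.lstsq(U_obs, x_obs, rcond=None)
    return float(np.sum((x_obs - U_obs @ coef) ** 2))

def projection(U):
    """Orthogonal projector onto span(U); assumes U has full column rank."""
    Q, _ = np.linalg.qr(U)
    return Q @ Q.T

def fusion_objective(X, M, Us, lam):
    """Illustrative objective (an assumption, not the paper's exact form):
    per-point fit on observed entries plus lam times the sum of pairwise
    squared distances between the per-point subspaces."""
    n = X.shape[1]
    fit = sum(masked_residual(X[:, i], M[:, i], Us[i]) for i in range(n))
    Ps = [projection(U) for U in Us]
    fuse = sum(
        np.linalg.norm(Ps[i] - Ps[j], "fro") ** 2
        for i in range(n)
        for j in range(i + 1, n)
    )
    return fit + lam * fuse

def complete(x, mask, U):
    """Fill the unobserved entries of x with its least-squares fit in span(U)."""
    coef, *_ = np.linalg.lstsq(U[mask], x[mask], rcond=None)
    out = x.copy()
    out[~mask] = (U @ coef)[~mask]
    return out
```

Here `X` is a d-by-n data matrix whose unobserved entries may hold any placeholder (for example NaN), `M` is a boolean mask of the same shape, and `Us` is a list of n basis matrices of size d-by-r. When two points share the same basis, their fusion term is zero, which is the sense in which subspaces of one cluster are "fused". A larger `lam` would push more subspaces together, which is presumably what traces out the clusterpath the abstract mentions.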
Taxonomy
Topics: Face and Expression Recognition · Advanced Clustering Algorithms Research · Anomaly Detection Techniques and Applications
