Fusion Subspace Clustering: Full and Incomplete Data
Daniel L. Pimentel-Alarcón, Usman Mahmood

TL;DR
This paper introduces a novel subspace clustering algorithm that effectively handles both complete and incomplete data by fusing subspaces, outperforming existing methods especially when data is missing.
Contribution
The proposed fusion subspace clustering method is the first to directly address both full and incomplete data without lifting or matrix completion, achieving near-optimal sampling rates.
Findings
Performs comparably to state-of-the-art methods on complete data
Performs significantly better than existing methods when data is missing
Handles noise and various data ranks effectively
Abstract
Modern inference and learning often hinge on identifying low-dimensional structures that approximate large-scale data. Subspace clustering achieves this through a union of linear subspaces. However, in contemporary applications data is increasingly often incomplete, rendering standard (full-data) methods inapplicable. On the other hand, existing incomplete-data methods present major drawbacks, like lifting an already high-dimensional problem, or requiring a super-polynomial number of samples. Motivated by this, we introduce a new subspace clustering algorithm inspired by fusion penalties. The main idea is to permanently assign each datum to a subspace of its own, and minimize the distance between the subspaces of all data, so that subspaces of the same cluster get fused together. Our approach is entirely new to both full and missing data and, unlike other methods, it directly allows…
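The fusion idea described in the abstract (one subspace per datum, plus a penalty that pulls subspaces together) can be illustrated with a small sketch of an objective function. This is not the authors' formulation or code: the function names, the least-squares fit on observed entries only, the projection-matrix distance between subspaces, and the weight `lam` are all assumptions chosen for illustration.

```python
import numpy as np

def orthonormalize(U):
    # Orthonormal basis for the column span of U (reduced QR).
    Q, _ = np.linalg.qr(U)
    return Q

def masked_residual(x, observed, U):
    # Squared distance from x to span(U), measured only on observed entries.
    # Assumption: missing entries are simply ignored via the boolean mask.
    U_obs = U[observed]
    x_obs = x[observed]
    coeffs, *_ = np.linalg.lstsq(U_obs, x_obs, rcond=None)
    return float(np.sum((x_obs - U_obs @ coeffs) ** 2))

def subspace_distance(U, V):
    # Assumed distance: squared Frobenius norm between orthogonal projectors.
    Qu, Qv = orthonormalize(U), orthonormalize(V)
    return float(np.sum((Qu @ Qu.T - Qv @ Qv.T) ** 2))

def fusion_objective(X, mask, bases, lam):
    # X: (d, n) data, one datum per column.
    # mask: (d, n) boolean array, True where an entry is observed.
    # bases: list of n arrays of shape (d, r), one subspace per datum.
    # lam: hypothetical weight on the fusion penalty.
    n = X.shape[1]
    fit = sum(masked_residual(X[:, i], mask[:, i], bases[i]) for i in range(n))
    fuse = sum(
        subspace_distance(bases[i], bases[j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    return fit + lam * fuse
```

Under these assumptions, the first term rewards each subspace for explaining the observed entries of its own datum, while the second term is zero exactly when all subspaces coincide, so data whose subspaces have been fused can be read off as one cluster. Minimizing such an objective, and choosing `lam`, is a separate matter that this sketch does not address.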
Taxonomy
Topics: Bayesian Methods and Mixture Models · Data Quality and Management · Advanced Clustering Algorithms Research
