Fusion Subspace Clustering: Full and Incomplete Data

Daniel L. Pimentel-Alarcón; Usman Mahmood

arXiv:1808.00628·cs.LG·August 3, 2018

TL;DR

This paper introduces a novel subspace clustering algorithm that effectively handles both complete and incomplete data by fusing subspaces, outperforming existing methods especially when data is missing.

Contribution

The proposed fusion subspace clustering method is the first to directly address both full and incomplete data without lifting or matrix completion, achieving near-optimal sampling rates.

Findings

1. Performs comparably to state-of-the-art methods on complete data
2. Performs significantly better than existing methods when data is missing
3. Handles noise and data of various ranks effectively

Abstract

Modern inference and learning often hinge on identifying low-dimensional structures that approximate large-scale data. Subspace clustering achieves this through a union of linear subspaces. However, in contemporary applications data is increasingly often incomplete, rendering standard (full-data) methods inapplicable. On the other hand, existing incomplete-data methods have major drawbacks, such as lifting an already high-dimensional problem or requiring a super-polynomial number of samples. Motivated by this, we introduce a new subspace clustering algorithm inspired by fusion penalties. The main idea is to permanently assign each datum to a subspace of its own and to minimize the distance between the subspaces of all data, so that subspaces of the same cluster get fused together. Our approach is entirely new to both full and missing data and, unlike other methods, it directly allows…
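The fusion idea in the abstract can be illustrated with a small sketch. The abstract does not state the objective, so everything below is an assumption made for illustration: each datum gets its own basis, the fit term measures the residual on observed entries only, the fusion term is the squared Frobenius distance between orthogonal projectors, and the names `residual`, `subspace_distance`, `fusion_objective`, and `lam` are hypothetical. It evaluates such an objective and does not reproduce the paper's optimization procedure.

```python
import numpy as np

def residual(x, mask, U):
    """Squared distance from the observed entries of x to span(U),
    with U restricted to the same observed rows."""
    Uo, xo = U[mask], x[mask]
    coef, *_ = np.linalg.lstsq(Uo, xo, rcond=None)
    r = xo - Uo @ coef
    return float(r @ r)

def subspace_distance(U, V):
    """Squared Frobenius distance between the orthogonal projectors
    onto span(U) and span(V) (an assumed choice of distance)."""
    Qu, _ = np.linalg.qr(U)
    Qv, _ = np.linalg.qr(V)
    D = Qu @ Qu.T - Qv @ Qv.T
    return float(np.sum(D * D))

def fusion_objective(X, masks, bases, lam):
    """Fit term plus lam times the pairwise fusion penalty.
    X: (d, n) data, masks: (d, n) boolean (True = observed),
    bases: list of n arrays of shape (d, r), one per datum."""
    n = X.shape[1]
    fit = sum(residual(X[:, i], masks[:, i], bases[i]) for i in range(n))
    fuse = sum(
        subspace_distance(bases[i], bases[j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    return fit + lam * fuse

# Toy example: two points on one line, one point on an orthogonal line.
e1 = np.array([[1.0], [0.0], [0.0]])
e2 = np.array([[0.0], [1.0], [0.0]])
X = np.column_stack([2 * e1[:, 0], 3 * e1[:, 0], e2[:, 0]])
masks = np.ones_like(X, dtype=bool)
value = fusion_objective(X, masks, [e1, e1, e2], lam=0.5)
```

In this toy setup the two data that share a basis contribute nothing to the fusion term, while pairs on different lines are penalized. Under these assumptions, increasing `lam` pushes more per-datum subspaces to coincide, which is the sense in which subspaces of the same cluster get fused.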

Taxonomy

Topics: Bayesian Methods and Mixture Models · Data Quality and Management · Advanced Clustering Algorithms Research