Revisiting data augmentation for subspace clustering
Maryam Abdolali, Nicolas Gillis

TL;DR
This paper investigates the impact of data distribution on subspace clustering and introduces augmentation-based frameworks that enhance the performance of self-expressive models in both unsupervised and semi-supervised settings.
Contribution
It highlights the importance of data distribution within subspaces and proposes augmentation strategies to improve self-expressive subspace clustering methods.
Findings
Data augmentation significantly improves clustering accuracy.
Augmentation-based frameworks outperform traditional methods.
The semi-supervised approach makes effective use of limited labeled data.
Abstract
Subspace clustering is the classical problem of clustering a collection of data samples that approximately lie around several low-dimensional subspaces. The current state-of-the-art approaches for this problem are based on the self-expressive model, which represents each sample as a linear combination of other samples. However, these approaches require sufficiently well-spread samples for accurate representation, which might not be available in many applications. In this paper, we shed light on this commonly neglected issue and argue that data distribution within each subspace plays a critical role in the success of self-expressive models. Our proposed solution to tackle this issue is motivated by the central role of data augmentation in the generalization power of deep neural networks. We propose two subspace clustering frameworks for both unsupervised and semi-supervised…
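To make the self-expressive model in the abstract concrete, here is a minimal sketch in Python with NumPy. It is not the authors' method: it uses a ridge (least-squares) penalty as a stand-in regularizer, and the names `self_expressive_coefficients`, `affinity`, and `lam` are hypothetical, chosen for illustration. Each sample is written as a linear combination of the other samples, and the absolute coefficients are symmetrized into an affinity matrix that a spectral clustering step could consume.

```python
import numpy as np


def self_expressive_coefficients(X, lam=1e-2):
    """Ridge-regularized self-expression (illustrative assumption, not the paper's model).

    X   : (d, n) array whose columns are samples.
    lam : ridge weight (hypothetical parameter).
    Returns C of shape (n, n) with zero diagonal such that X[:, j] ~ X @ C[:, j].
    """
    d, n = X.shape
    C = np.zeros((n, n))
    for j in range(n):
        # Exclude sample j so it cannot trivially represent itself.
        others = [i for i in range(n) if i != j]
        A = X[:, others]
        # Closed-form ridge solution: (A^T A + lam I) c = A^T x_j
        G = A.T @ A + lam * np.eye(n - 1)
        C[others, j] = np.linalg.solve(G, A.T @ X[:, j])
    return C


def affinity(C):
    """Symmetric, non-negative affinity built from the coefficient matrix."""
    return np.abs(C) + np.abs(C).T


# Toy data: two orthogonal 2-D subspaces in R^4, five samples each.
rng = np.random.default_rng(0)
X1 = np.vstack([rng.standard_normal((2, 5)), np.zeros((2, 5))])
X2 = np.vstack([np.zeros((2, 5)), rng.standard_normal((2, 5))])
X = np.hstack([X1, X2])

C = self_expressive_coefficients(X)
W = affinity(C)
```

In this toy setting the subspaces are orthogonal, so the cross-subspace blocks of `W` vanish and the affinity is block-diagonal. With real data the subspaces are rarely this well separated, and the quality of `C` depends on how well the samples cover each subspace, which is the issue the abstract raises.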
