LSD-C: Linearly Separable Deep Clusters

Sylvestre-Alvise Rebuffi; Sebastien Ehrhardt; Kai Han; Andrea Vedaldi; Andrew Zisserman

arXiv:2006.10039 · cs.CV · June 18, 2020

TL;DR

LSD-C introduces a clustering method that enforces linear separability in feature space, improving unsupervised clustering performance on image and text datasets by combining pairwise similarity, self-supervised pretraining, and data augmentation.

Contribution

The paper proposes a novel clustering algorithm that ensures linear separability of clusters in deep feature space, enhancing unsupervised learning effectiveness.

Findings

1. Outperforms existing methods on CIFAR 10/100, STL 10, MNIST, and Reuters 10K datasets.

2. Effectively combines pairwise similarity, self-supervised pretraining, and data augmentation.

3. Achieves significant improvements in clustering accuracy and separation quality.

Abstract

We present LSD-C, a novel method to identify clusters in an unlabeled dataset. Our algorithm first establishes pairwise connections in the feature space between the samples of the minibatch based on a similarity metric. Then it regroups in clusters the connected samples and enforces a linear separation between clusters. This is achieved by using the pairwise connections as targets together with a binary cross-entropy loss on the predictions that the associated pairs of samples belong to the same cluster. This way, the feature representation of the network will evolve such that similar samples in this feature space will belong to the same linearly separated cluster. Our method draws inspiration from recent semi-supervised learning practice and proposes to combine our clustering algorithm with self-supervised pretraining and strong data augmentation. We show that our approach…
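
The pairwise objective described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the official repository for that); it is a minimal pure-Python illustration under several assumptions that the abstract does not state: cosine similarity with a fixed threshold as the similarity metric, the inner product of two softmax cluster distributions as the "same cluster" prediction, and the `logits` standing in for the output of a linear layer on top of the features. The names `pairwise_targets`, `pairwise_bce`, and `threshold` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(x - m) for x in z]
    total = sum(exps)
    return [x / total for x in exps]


def pairwise_targets(features, threshold=0.9):
    """Connect pairs (i, j) whose cosine similarity reaches the threshold.

    Assumption: a fixed threshold on cosine similarity; the abstract only
    says the connections are based on "a similarity metric".
    """
    n = len(features)
    return {
        (i, j): 1.0 if cosine(features[i], features[j]) >= threshold else 0.0
        for i in range(n)
        for j in range(i + 1, n)
    }


def pairwise_bce(logits, targets, eps=1e-7):
    """Binary cross-entropy between pairwise targets and same-cluster predictions.

    Assumption: the probability that samples i and j share a cluster is the
    inner product of their softmax cluster distributions.
    """
    probs = [softmax(z) for z in logits]
    total = 0.0
    for (i, j), t in targets.items():
        p = sum(a * b for a, b in zip(probs[i], probs[j]))
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        total -= t * math.log(p) + (1.0 - t) * math.log(1.0 - p)
    return total / len(targets)


# Toy minibatch: samples 0 and 1 are similar, as are samples 2 and 3.
features = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.0, 0.8]]
targets = pairwise_targets(features)

# Cluster logits that agree with the connections, and logits that do not.
good_logits = [[4.0, -4.0], [4.0, -4.0], [-4.0, 4.0], [-4.0, 4.0]]
bad_logits = [[4.0, -4.0], [-4.0, 4.0], [4.0, -4.0], [-4.0, 4.0]]

good_loss = pairwise_bce(good_logits, targets)
bad_loss = pairwise_bce(bad_logits, targets)
```

In this toy setup the loss is lower when connected samples receive the same cluster assignment (`good_loss < bad_loss`), which is the pressure that, per the abstract, drives similar samples into the same linearly separated cluster during training.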

Code & Models

Repositories

srebuffi/lsd-clusters (PyTorch, official)
