Improving k-Means Clustering Performance with Disentangled Internal Representations

Abien Fred Agarap; Arnulfo P. Azcarraga

arXiv:2006.04535 · cs.LG · June 9, 2020

TL;DR

This paper improves k-means clustering by disentangling the latent representations of an autoencoder with a soft nearest neighbor loss extended by an annealing temperature factor, which yields higher clustering accuracy.

Contribution

The paper improves clustering performance by disentangling the latent code of an autoencoder, rather than jointly optimizing a clustering loss and a non-clustering loss as deep clustering frameworks do.

Findings

01

Achieved 96.2% test clustering accuracy on MNIST

02

Reached 85.6% test clustering accuracy on Fashion-MNIST, outperforming the baseline models

03

Reached 79.2% test clustering accuracy on the EMNIST Balanced dataset

Abstract

Deep clustering algorithms combine representation learning and clustering by jointly optimizing a clustering loss and a non-clustering loss. In such methods, a deep neural network is used for representation learning together with a clustering network. Instead of following this framework to improve clustering performance, we propose a simpler approach of optimizing the entanglement of the learned latent code representation of an autoencoder. We define entanglement as how close pairs of points from the same class or structure are, relative to pairs of points from different classes or structures. To measure the entanglement of data points, we use the soft nearest neighbor loss, and expand it by introducing an annealing temperature factor. Using our proposed approach, the test clustering accuracy was 96.2% on the MNIST dataset, 85.6% on the Fashion-MNIST dataset, and 79.2% on the EMNIST…
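To make the entanglement measure concrete, the following is a minimal NumPy sketch of the soft nearest neighbor loss in its commonly published form: for each point, the negative log of the ratio between the summed similarities to same-class neighbors and the summed similarities to all other points, with similarity exp(-squared distance / temperature). This is an illustration, not the authors' PyTorch implementation from the linked repository. The function names, the `eps` stabilizer, and in particular the annealing schedule `1 / (1 + epoch) ** gamma` with `gamma=0.55` are assumptions chosen for the example; the abstract only states that an annealing temperature factor is introduced.

```python
import numpy as np


def soft_nearest_neighbor_loss(features, labels, temperature=1.0, eps=1e-9):
    """Soft nearest neighbor loss over a batch (lower means less entangled).

    features: array of shape (batch, dim); labels: array of shape (batch,).
    """
    features = np.asarray(features, dtype=np.float64)
    labels = np.asarray(labels)

    # Pairwise squared Euclidean distances, shape (batch, batch).
    diffs = features[:, None, :] - features[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)

    # Similarity kernel; a point is never its own neighbor.
    sims = np.exp(-sq_dists / temperature)
    np.fill_diagonal(sims, 0.0)

    # Numerator: same-class neighbors. Denominator: all other points.
    same_class = labels[:, None] == labels[None, :]
    numer = np.sum(sims * same_class, axis=1)
    denom = np.sum(sims, axis=1)

    return float(-np.mean(np.log(eps + numer / (eps + denom))))


def annealed_temperature(epoch, gamma=0.55):
    """Hypothetical annealing schedule (an assumption, not from the source)."""
    return 1.0 / (1.0 + epoch) ** gamma


# Two tight, well-separated blobs in 2-D.
rng = np.random.default_rng(0)
points = np.concatenate([
    rng.normal(0.0, 0.1, size=(20, 2)),
    rng.normal(5.0, 0.1, size=(20, 2)),
])
true_labels = np.array([0] * 20 + [1] * 20)   # labels follow the blobs
mixed_labels = np.arange(40) % 2              # labels ignore the blobs

low = soft_nearest_neighbor_loss(points, true_labels)
high = soft_nearest_neighbor_loss(points, mixed_labels)
print("labels follow structure:", low)
print("labels ignore structure:", high)
print("temperature at epochs 0, 1, 10:",
      [annealed_temperature(e) for e in (0, 1, 10)])
```

When the labels follow the geometric structure, each point's nearest neighbors share its class and the loss is close to zero; when the labels cut across the structure, the loss is clearly larger. Minimizing this quantity on an autoencoder's latent code is the sense in which the representation is disentangled before k-means is applied.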

Code & Models

Repositories

https://gitlab.com/afagarap/pt-snnl (PyTorch)

Taxonomy

Methods: Soft Nearest Neighbor Loss with Annealing Temperature