Joint Optimization of an Autoencoder for Clustering and Embedding

Ahcène Boubekki; Michael Kampffmeyer; Robert Jenssen; Ulf Brefeld

arXiv:2012.03740·stat.ML·May 4, 2021


TL;DR

This paper introduces a novel deep clustering approach that jointly optimizes an autoencoder and clustering, leveraging a theoretical link to Gaussian mixture models to improve unsupervised categorization.

Contribution

It presents a unified deep clustering model that learns embeddings and clusters simultaneously, based on a theoretical connection between GMMs and autoencoder loss functions.

Findings

01 Outperforms baseline methods on multiple datasets

02 Shows that the objective of a certain class of GMMs can be rephrased as the loss of a one-hidden-layer autoencoder

03 Joint optimization of the autoencoder and the clustering improves clustering quality

Abstract

Deep embedded clustering has become a dominant approach to unsupervised categorization of objects with deep neural networks. The optimization of the most popular methods alternates between training a deep autoencoder and running k-means clustering on the autoencoder's embedding. This diachronic setting, however, prevents the former from benefiting from valuable information acquired by the latter. In this paper, we present an alternative where the autoencoder and the clustering are learned simultaneously. This is achieved by providing novel theoretical insight: we show that the objective function of a certain class of Gaussian mixture models (GMMs) can naturally be rephrased as the loss function of a one-hidden-layer autoencoder, which thus inherits the built-in clustering capabilities of the GMM. That simple neural network, referred to as the clustering module, can be integrated into a…
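The link between a GMM and a one-hidden-layer autoencoder can be illustrated with a small NumPy sketch. It is not the paper's clustering module or its loss. It assumes isotropic Gaussians with equal priors and a shared variance (scaled so that 2σ² = 1), in which case the GMM responsibilities are a softmax of a linear function of the input, i.e. a dense layer, and reconstructing each point as the responsibility-weighted centroid plays the role of a decoder. The names `gmm_responsibilities`, `encoder`, and `decoder` are hypothetical.

```python
import numpy as np

def _softmax(logits):
    # Row-wise softmax with the usual max-shift for numerical stability.
    logits = logits - logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def gmm_responsibilities(x, mu):
    """Posterior cluster probabilities for an isotropic, equal-prior GMM
    (assumption: shared variance with 2*sigma^2 = 1)."""
    d2 = ((x[:, None, :] - mu[None, :, :]) ** 2).sum(axis=2)
    return _softmax(-d2)

def encoder(x, mu):
    """The same responsibilities written as a dense layer plus softmax.
    -||x - mu_k||^2 = 2 x.mu_k - ||mu_k||^2 - ||x||^2, and the ||x||^2
    term cancels inside the softmax."""
    w = 2.0 * mu.T                # weights, shape (dim, k)
    b = -(mu ** 2).sum(axis=1)    # biases, shape (k,)
    return _softmax(x @ w + b)

def decoder(gamma, mu):
    """Reconstruct each point as its responsibility-weighted centroid."""
    return gamma @ mu

# Toy data: two well-separated blobs around the (assumed known) centroids.
rng = np.random.default_rng(0)
mu = np.array([[0.0, 0.0], [5.0, 5.0]])
x = np.vstack([
    rng.normal(mu[0], 0.3, size=(50, 2)),
    rng.normal(mu[1], 0.3, size=(50, 2)),
])

gamma = encoder(x, mu)      # hidden layer = soft cluster assignments
x_hat = decoder(gamma, mu)  # output layer = reconstruction
loss = float(((x - x_hat) ** 2).sum(axis=1).mean())
```

In this sketch the centroids are fixed. In a joint setting they would be trainable parameters updated by gradient descent together with the rest of the network, which is the kind of simultaneous learning the abstract describes.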


Code & Models

Repositories

Ahcene-B/clustering-Module (tf, Official)


Taxonomy

Methods: k-Means Clustering