Self-labelling via simultaneous clustering and representation learning
Yuki Markus Asano, Christian Rupprecht, Andrea Vedaldi

TL;DR
This paper introduces an unsupervised learning method that combines clustering and representation learning by maximizing the information between labels and input data indices, which lets a network self-label images without manual annotations.
Contribution
It formulates a principled approach that extends standard cross-entropy minimization to an optimal transport problem, solved efficiently with a fast variant of the Sinkhorn-Knopp algorithm, and reports state-of-the-art results in self-supervised image representation learning.
Findings
Achieves state-of-the-art performance on SVHN, CIFAR-10, CIFAR-100, and ImageNet.
First self-supervised AlexNet outperforming supervised Pascal VOC detection baseline.
Efficiently scales to millions of images and thousands of labels.
Abstract
Combining clustering and representation learning is one of the most promising approaches for unsupervised learning of deep neural networks. However, doing so naively leads to ill posed learning problems with degenerate solutions. In this paper, we propose a novel and principled learning formulation that addresses these issues. The method is obtained by maximizing the information between labels and input data indices. We show that this criterion extends standard crossentropy minimization to an optimal transport problem, which we solve efficiently for millions of input images and thousands of labels using a fast variant of the Sinkhorn-Knopp algorithm. The resulting method is able to self-label visual data so as to train highly competitive image representations without manual labels. Our method achieves state of the art representation learning performance for AlexNet and ResNet-50 on…
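The self-labelling step described in the abstract can be illustrated with a small sketch. It assumes that labels are constrained to split the data evenly across classes, and that the assignment is found by alternately rescaling rows and columns of a kernel built from the model's predicted probabilities (the basic Sinkhorn-Knopp iteration). The function names, the toy probabilities, and the values of `lam` and `n_iters` are hypothetical choices for illustration; this plain-loop version is not the fast variant the authors use.

```python
def sinkhorn_knopp(probs, lam=5.0, n_iters=500):
    """Balanced soft assignment from an N x K table of class probabilities.

    Returns an N x K table Q whose rows each sum to 1/N and whose
    columns each sum (approximately) to 1/K.
    `lam` is a hypothetical sharpness parameter for this sketch.
    """
    n, k = len(probs), len(probs[0])
    # Kernel: predicted probabilities raised to the power lam.
    m = [[p ** lam for p in row] for row in probs]
    alpha = [1.0] * k  # per-class scaling factors
    beta = [1.0] * n   # per-sample scaling factors
    for _ in range(n_iters):
        # Rescale so that every class receives total mass 1/K.
        for y in range(k):
            col = sum(m[i][y] * beta[i] for i in range(n))
            alpha[y] = (1.0 / k) / col
        # Rescale so that every sample carries total mass 1/N.
        for i in range(n):
            row = sum(m[i][y] * alpha[y] for y in range(k))
            beta[i] = (1.0 / n) / row
    return [[beta[i] * m[i][y] * alpha[y] for y in range(k)] for i in range(n)]


def hard_labels(q):
    """Pick the class with the largest assignment mass for each sample."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


# Toy model output: every sample prefers class 0, so a plain argmax
# would collapse all four samples onto one label.
probs = [
    [0.9, 0.1],
    [0.8, 0.2],
    [0.7, 0.3],
    [0.6, 0.4],
]
q = sinkhorn_knopp(probs)
labels = hard_labels(q)
print(labels)  # [0, 0, 1, 1]
```

In this toy case the equal-split constraint moves the two samples that are least confident about class 0 onto class 1, which is the kind of degenerate single-cluster solution the abstract says naive approaches run into.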
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning
Methods: 1x1 Convolution · Convolution · Local Response Normalization · Grouped Convolution · Dropout · Dense Connections · Max Pooling · Softmax
