Unsupervised visualization of image datasets using contrastive learning
Jan Niklas Böhm, Philipp Berens, Dmitry Kobak

TL;DR
This paper introduces t-SimCNE, a novel unsupervised visualization method that combines contrastive learning and neighbor embeddings to produce meaningful 2D visualizations of image datasets, capturing semantic relationships effectively.
Contribution
The paper proposes t-SimCNE, a new parametric method that generates 2D visualizations from high-dimensional image data using contrastive learning, improving interpretability and semantic fidelity.
Findings
Despite embedding into only two dimensions, t-SimCNE achieves classification accuracy comparable to that of SimCLR's high-dimensional representations.
It produces informative visualizations with clear cluster structures.
The method highlights artifacts and outliers effectively.
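One way to make the accuracy comparison above concrete is to score an embedding with a k-nearest-neighbor classifier, once on the 2D output and once on the high-dimensional representation. The sketch below is only an illustration: this summary does not state which classifier or which value of k was used, so `knn_accuracy`, its arguments, and the default `k=15` are assumptions.

```python
import numpy as np

def knn_accuracy(train_z, train_y, test_z, test_y, k=15):
    """Majority-vote kNN accuracy on embeddings of any dimensionality.

    Hypothetical helper: labels are assumed to be non-negative integers.
    """
    train_z = np.asarray(train_z, dtype=float)
    test_z = np.asarray(test_z, dtype=float)
    train_y = np.asarray(train_y, dtype=int)
    test_y = np.asarray(test_y, dtype=int)

    # Squared Euclidean distance from every test point to every training point.
    sq_dist = ((test_z[:, None, :] - train_z[None, :, :]) ** 2).sum(axis=-1)

    # Indices of the k closest training points for each test point.
    k = min(k, train_z.shape[0])
    neighbors = np.argsort(sq_dist, axis=1)[:, :k]

    # Majority vote over the neighbors' labels.
    n_classes = int(train_y.max()) + 1
    pred = np.array(
        [np.bincount(train_y[row], minlength=n_classes).argmax() for row in neighbors]
    )
    return float((pred == test_y).mean())
```

Calling the same function on the 2D embedding and on the high-dimensional features of the same images gives two numbers that can be compared directly.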
Abstract
Visualization methods based on the nearest neighbor graph, such as t-SNE or UMAP, are widely used for visualizing high-dimensional data. Yet, these approaches only produce meaningful results if the nearest neighbors themselves are meaningful. For images represented in pixel space this is not the case, as distances in pixel space are often not capturing our sense of similarity and therefore neighbors are not semantically close. This problem can be circumvented by self-supervised approaches based on contrastive learning, such as SimCLR, relying on data augmentation to generate implicit neighbors, but these methods do not produce two-dimensional embeddings suitable for visualization. Here, we present a new method, called t-SimCNE, for unsupervised visualization of image data. T-SimCNE combines ideas from contrastive learning and neighbor embeddings, and trains a parametric mapping from the…
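The combination described in the abstract can be illustrated with a small loss function. The abstract is truncated here, so the details below are assumptions rather than the authors' exact formulation: the sketch takes a SimCLR-style InfoNCE loss and swaps the usual cosine similarity for a heavy-tailed Cauchy kernel on Euclidean distances, as used in t-SNE. The name `cauchy_infonce_loss` and its arguments are hypothetical.

```python
import numpy as np

def cauchy_infonce_loss(z_a, z_b):
    """InfoNCE-style loss with a Cauchy similarity kernel (illustrative sketch).

    z_a, z_b: arrays of shape (n, d). Row i of z_a and row i of z_b are the
    embeddings of two augmented views of the same image (the implicit
    "neighbors"). For visualization, d would be 2.
    """
    z_a = np.asarray(z_a, dtype=float)
    z_b = np.asarray(z_b, dtype=float)
    n = z_a.shape[0]
    z = np.concatenate([z_a, z_b], axis=0)  # shape (2n, d)

    # Pairwise squared Euclidean distances between all 2n embeddings.
    sq_dist = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)

    # Cauchy kernel: similarity 1 at distance 0, decaying with a heavy tail.
    sim = 1.0 / (1.0 + sq_dist)
    np.fill_diagonal(sim, 0.0)  # a point is not its own neighbor

    # The positive partner of row i is the other view of the same image.
    partner = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    positive = sim[np.arange(2 * n), partner]

    # Normalize by the similarity to all other points in the batch.
    return float(np.mean(-np.log(positive / sim.sum(axis=1))))
```

The loss is small when the two views of each image land close together while different images stay far apart, and large when the views of one image are scattered among other images. In a real setup the embeddings would come from a trained network and the loss would be minimized with automatic differentiation; NumPy is used here only to keep the arithmetic visible.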
Taxonomy
Topics: Data Visualization and Analytics · Advanced Clustering Algorithms Research · Data Analysis with R
Methods: Batch Normalization · 1x1 Convolution · Residual Connection · Kaiming Initialization · Max Pooling · Dense Connections · Average Pooling · Global Average Pooling · Bottleneck Residual Block · Residual Block
