ViCE: Improving Dense Representation Learning by Superpixelization and Contrasting Cluster Assignment
Robin Karlsson, Tomoki Hayashi, Keisuke Fujii, Alexander Carballo, Kento Ohtani, Kazuya Takeda

TL;DR
This paper introduces superpixel-based regional contrastive learning to enhance dense visual representations, preserving high-resolution spatial detail and improving unsupervised semantic segmentation performance.
Contribution
It proposes a novel superpixelization approach for dense self-supervised learning, reducing computational complexity and extending contrastive methods to high-resolution images.
Findings
Superpixel contrast improves dense embedding quality.
The method outperforms grid-based approaches and improves results on segmentation benchmarks.
Regional masking further boosts performance.
Abstract
Recent self-supervised models have demonstrated equal or better performance than supervised methods, opening the way for AI systems to learn visual representations from practically unlimited data. However, these methods are typically classification-based and thus ineffective for learning high-resolution feature maps that preserve precise spatial information. This work introduces superpixels to improve self-supervised learning of dense, semantically rich visual concept embeddings. Decomposing images into a small set of visually coherent regions reduces the computational complexity while preserving detail. We experimentally show that contrasting over regions improves the effectiveness of contrastive learning methods, extends their applicability to high-resolution images, and improves overclustering performance; that superpixels are better than grids; and that regional masking improves…
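The region decomposition described in the abstract can be illustrated with a minimal sketch: per-pixel embeddings are averaged inside each region, so later contrastive steps operate on a handful of region vectors instead of every pixel. This is not the authors' implementation. The function names `region_mean_pool` and `grid_labels` are hypothetical, the embeddings are random placeholders, and a square grid stands in for a real superpixel map (which could come from a superpixel algorithm such as SLIC, an assumption not stated in the text above).

```python
import numpy as np


def region_mean_pool(features, labels, num_regions):
    """Average per-pixel embeddings inside each region.

    features: (H, W, D) float array of dense embeddings.
    labels:   (H, W) int array with region ids in [0, num_regions).
    Returns (num_regions, D) mean embeddings and (num_regions,) pixel counts.
    """
    h, w, d = features.shape
    flat_feats = features.reshape(h * w, d)
    flat_labels = labels.reshape(h * w)

    # Unbuffered scatter-add: every pixel adds its vector to its region's row.
    sums = np.zeros((num_regions, d), dtype=np.float64)
    np.add.at(sums, flat_labels, flat_feats)

    counts = np.bincount(flat_labels, minlength=num_regions)
    # Guard against empty regions to avoid division by zero.
    means = sums / np.maximum(counts, 1)[:, None]
    return means, counts


def grid_labels(h, w, cell):
    """Hypothetical stand-in for a superpixel map: square grid cells."""
    rows = np.arange(h) // cell
    cols = np.arange(w) // cell
    cells_per_row = (w + cell - 1) // cell
    return rows[:, None] * cells_per_row + cols[None, :]


# Placeholder data: a 64x64 map of 16-dimensional embeddings.
rng = np.random.default_rng(0)
H, W, D, CELL = 64, 64, 16, 8
features = rng.normal(size=(H, W, D))

labels = grid_labels(H, W, CELL)
num_regions = int(labels.max()) + 1
region_embeddings, counts = region_mean_pool(features, labels, num_regions)

pixel_count = H * W
print(region_embeddings.shape)      # (64, 16)
print(pixel_count // num_regions)   # 64 pixels summarized per region vector
```

Swapping `grid_labels` for a label map that follows object boundaries leaves `region_mean_pool` unchanged, which is one way to compare grids against superpixels under otherwise identical pooling.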
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Remote-Sensing Image Classification · Advanced Image and Video Retrieval Techniques
Methods: Average Pooling · Batch Normalization · 1x1 Convolution · Residual Connection · SuperpixelGridCut, SuperpixelGridMean, SuperpixelGridMix · Max Pooling · Global Average Pooling · Bottleneck Residual Block · Residual Block
