Dense Contrastive Learning for Self-Supervised Visual Pre-Training

Xinlong Wang; Rufeng Zhang; Chunhua Shen; Tao Kong; Lei Li

arXiv:2011.09157 · cs.CV · April 6, 2021 · 40 cites

TL;DR

This paper introduces dense contrastive learning, a self-supervised pre-training approach that optimizes a contrastive loss over pixel-level (local) feature correspondences between two views of an image. It improves transfer to dense prediction tasks while running less than 1% slower than MoCo-v2.

Contribution

The paper proposes a dense contrastive learning method that directly aligns local features at the pixel level, outperforming the MoCo-v2 baseline on various dense prediction benchmarks.

Findings

01 Achieves up to 2.0% AP improvement over MoCo-v2 on PASCAL VOC object detection

02 Outperforms state-of-the-art methods on multiple dense prediction tasks

03 Runs less than 1% slower than MoCo-v2

Abstract

To date, most existing self-supervised learning methods are designed and optimized for image classification. These pre-trained models can be sub-optimal for dense prediction tasks due to the discrepancy between image-level prediction and pixel-level prediction. To fill this gap, we aim to design an effective, dense self-supervised learning method that directly works at the level of pixels (or local features) by taking into account the correspondence between local features. We present dense contrastive learning, which implements self-supervised learning by optimizing a pairwise contrastive (dis)similarity loss at the pixel level between two views of input images. Compared to the baseline method MoCo-v2, our method introduces negligible computation overhead (only <1% slower), but demonstrates consistently superior performance when transferring to downstream dense prediction tasks…
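The pixel-level contrastive loss described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function name `dense_contrastive_loss`, the argmax-similarity matching rule, the temperature of 0.2, and the use of a plain list of negative vectors are all assumptions made for illustration. The sketch also uses the same features for matching and for the loss, which is a simplification.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dense_contrastive_loss(view1, view2, negatives, temperature=0.2):
    """Average InfoNCE-style loss over the local features of view1.

    view1, view2: lists of local feature vectors, one per spatial location,
                  taken from two augmented views of the same image.
    negatives:    list of feature vectors from other images (assumed given).
    temperature:  softmax temperature (0.2 is an assumed default).
    """
    q = [l2_normalize(v) for v in view1]
    k = [l2_normalize(v) for v in view2]
    neg = [l2_normalize(v) for v in negatives]

    total = 0.0
    for qi in q:
        # Assumed matching rule: the positive for each location in view1 is
        # the most similar location in view2.
        pos = max(k, key=lambda kj: dot(qi, kj))
        logits = [dot(qi, pos) / temperature]
        logits += [dot(qi, n) / temperature for n in neg]
        # Numerically stable -log softmax of the positive logit.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denom - logits[0]
    return total / len(q)
```

With hand-made features, the loss is close to zero when every location in one view has a near-identical match in the other view, and it grows when the best match is no more similar than the negatives.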

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques

Methods: Dense Contrastive Learning