Dense Contrastive Learning for Self-Supervised Visual Pre-Training
Xinlong Wang, Rufeng Zhang, Chunhua Shen, Tao Kong, Lei Li

TL;DR
This paper introduces dense contrastive learning, a self-supervised pre-training method that optimizes a contrastive loss over pixel-level (local) feature correspondences between two views of an image. It improves transfer to dense prediction tasks while running less than 1% slower than MoCo-v2.
Contribution
The paper proposes a dense contrastive learning method that contrasts local features at the pixel level rather than a single image-level feature. The pre-trained model outperforms the MoCo-v2 baseline when transferred to dense prediction benchmarks.
Findings
Achieves up to 2.0% AP improvement on PASCAL VOC object detection
Outperforms state-of-the-art methods on multiple dense prediction tasks
Runs less than 1% slower than the MoCo-v2 baseline
Abstract
To date, most existing self-supervised learning methods are designed and optimized for image classification. These pre-trained models can be sub-optimal for dense prediction tasks due to the discrepancy between image-level prediction and pixel-level prediction. To fill this gap, we aim to design an effective, dense self-supervised learning method that directly works at the level of pixels (or local features) by taking into account the correspondence between local features. We present dense contrastive learning, which implements self-supervised learning by optimizing a pairwise contrastive (dis)similarity loss at the pixel level between two views of input images. Compared to the baseline method MoCo-v2, our method introduces negligible computation overhead (only <1% slower), but demonstrates consistently superior performance when transferring to downstream dense prediction tasks…
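The pixel-level contrastive objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `dense_contrastive_loss`, the rule that matches each local feature to its most similar counterpart in the other view, the temperature of 0.2, and the explicit list of negatives are all assumptions made for illustration. Each view is represented as a flat list of local feature vectors, one per spatial location.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def dense_contrastive_loss(view1, view2, negatives, temperature=0.2):
    """InfoNCE-style loss averaged over the local features of view1.

    view1, view2: lists of local feature vectors from two views of an image.
    negatives: list of feature vectors used as negatives (hypothetical input).
    temperature: softmax temperature (assumed value, not from the source).
    """
    queries = [l2_normalize(v) for v in view1]
    keys = [l2_normalize(v) for v in view2]
    negs = [l2_normalize(v) for v in negatives]

    total = 0.0
    for q in queries:
        # Assumed correspondence rule: the positive is the most similar
        # local feature in the other view (cosine similarity).
        pos_sim = max(dot(q, k) for k in keys)
        logits = [pos_sim / temperature]
        logits += [dot(q, n) / temperature for n in negs]

        # Cross-entropy with the positive at index 0, computed with a
        # numerically stable log-sum-exp.
        m = max(logits)
        lse = m + math.log(sum(math.exp(x - m) for x in logits))
        total += lse - logits[0]

    return total / len(queries)


if __name__ == "__main__":
    # Two 2-D local features per view; view2 is a spatial permutation of view1.
    v1 = [[1.0, 0.0], [0.0, 1.0]]
    v2 = [[0.0, 1.0], [1.0, 0.0]]
    bank = [[-1.0, 0.0], [0.0, -1.0]]
    print(dense_contrastive_loss(v1, v2, bank))
```

Because the loss is computed per location and then averaged, it rewards local features that agree across views even when the two views are spatially shifted or permuted. An image-level contrastive loss, which compares one pooled vector per view, does not express this directly.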
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques
Methods: Dense Contrastive Learning
