TL;DR
This paper introduces contrastive detection, a self-supervised objective for visual pretraining that reaches state-of-the-art transfer accuracy with up to 10x less pretraining than existing self-supervised methods.
Contribution
The paper proposes a novel contrastive detection objective that reduces pretraining computational costs while maintaining state-of-the-art transfer performance.
Findings
Achieves state-of-the-art transfer accuracy with up to 10x less pretraining.
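The strongest ImageNet-pretrained model performs on par with SEER, one of the largest self-supervised systems to date, while using 1000x less pretraining data.
Handles pretraining on more complex images, such as those in COCO, closing the gap with supervised learning.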
Abstract
Self-supervised pretraining has been shown to yield powerful representations for transfer learning. These performance gains come at a large computational cost however, with state-of-the-art methods requiring an order of magnitude more computation than supervised pretraining. We tackle this computational bottleneck by introducing a new self-supervised objective, contrastive detection, which tasks representations with identifying object-level features across augmentations. This objective extracts a rich learning signal per image, leading to state-of-the-art transfer accuracy on a variety of downstream tasks, while requiring up to 10x less pretraining. In particular, our strongest ImageNet-pretrained model performs on par with SEER, one of the largest self-supervised systems to date, which uses 1000x more pretraining data. Finally, our objective seamlessly handles pretraining on more…
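The object-level objective described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: it assumes each augmented view comes with a set of binary region masks (here called `masks_a` and `masks_b`, hypothetical names), pools the feature map inside each mask, and applies an InfoNCE-style loss in which the same mask index in the two views is the positive pair and the other masks in the same image act as negatives. The function names, the temperature value, and the use of a single image are assumptions made for illustration; network heads and negatives drawn from other images are left out.

```python
import numpy as np

def mask_pool(features, masks):
    """Average the features that fall inside each mask.

    features: (P, D) array, one D-dim vector per spatial position.
    masks:    (M, P) binary array, one row per region.
    Returns an (M, D) array of per-region feature vectors.
    """
    # Guard against empty masks by clamping the pixel count to at least 1.
    weights = masks / np.maximum(masks.sum(axis=1, keepdims=True), 1.0)
    return weights @ features

def l2_normalize(x, eps=1e-8):
    """Scale each row to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def contrastive_detection_loss(feat_a, feat_b, masks_a, masks_b, temperature=0.1):
    """InfoNCE-style loss over mask-pooled features of two views.

    Row i of masks_a and row i of masks_b are assumed to cover the same
    region, so the diagonal of the similarity matrix holds the positives.
    """
    za = l2_normalize(mask_pool(feat_a, masks_a))
    zb = l2_normalize(mask_pool(feat_b, masks_b))
    logits = za @ zb.T / temperature                      # (M, M) similarities
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

In this sketch the loss is low when features pooled from the same region agree across the two views and differ from those of other regions, which is one way to read "identifying object-level features across augmentations". Each image contributes one term per mask rather than a single image-level term.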
Videos
DeepMind DetCon: Efficient Visual Pretraining with Contrastive Detection | Paper Explained (YouTube)
Taxonomy
Methods: Grouped Convolution · Dense Connections · 1x1 Convolution · Batch Normalization · Sigmoid Activation · Squeeze-and-Excitation Block · Average Pooling · Global Average Pooling · Convolution
