Aligning Pretraining for Detection via Object-Level Contrastive Learning
Fangyun Wei, Yue Gao, Zhirong Wu, Han Hu, Stephen Lin

TL;DR
This paper introduces SoCo, a self-supervised pretraining method that aligns the pretext task with object detection by learning object-level representations and building in detection-specific properties, which improves transfer performance.
Contribution
The paper proposes SoCo, an object-level contrastive pretraining method that aligns pretraining with detection by using selective search bounding boxes as object proposals and by including the same dedicated modules used in the detection pipeline (e.g. FPN).
Findings
Achieves state-of-the-art transfer results on COCO detection.
Outperforms previous methods in object detection transfer tasks.
Demonstrates the effectiveness of object-level alignment in pretraining.
Abstract
Image-level contrastive representation learning has proven to be highly effective as a generic model for transfer learning. Such generality for transfer learning, however, sacrifices specificity if we are interested in a certain downstream task. We argue that this could be sub-optimal and thus advocate a design principle which encourages alignment between the self-supervised pretext task and the downstream task. In this paper, we follow this principle with a pretraining method specifically designed for the task of object detection. We attain alignment in the following three aspects: 1) object-level representations are introduced via selective search bounding boxes as object proposals; 2) the pretraining network architecture incorporates the same dedicated modules used in the detection pipeline (e.g. FPN); 3) the pretraining is equipped with object detection properties such as…
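The object-level idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: the function names (roi_mean_pool, cosine, info_nce) are hypothetical, plain mean pooling stands in for RoIAlign, the boxes are written by hand rather than produced by selective search, the temperature of 0.2 is arbitrary, and the generic InfoNCE objective shown here may differ from the loss the paper actually uses.

```python
import math


def roi_mean_pool(feature_map, box):
    """Average the feature vectors inside box = (x0, y0, x1, y1).

    feature_map is indexed as feature_map[y][x] -> list of floats.
    Upper bounds are exclusive. This is a crude stand-in for RoIAlign.
    """
    x0, y0, x1, y1 = box
    cells = [feature_map[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    dim = len(cells[0])
    return [sum(cell[d] for cell in cells) / len(cells) for d in range(dim)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(queries, keys, temperature=0.2):
    """Mean InfoNCE loss; queries[i] and keys[i] describe the same proposal.

    Every other key acts as a negative for queries[i].
    """
    total = 0.0
    for i, query in enumerate(queries):
        logits = [cosine(query, key) / temperature for key in keys]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_denominator - logits[i]
    return total / len(queries)


# Toy 2x2 feature map with 2-dimensional features (illustrative values only).
feature_map = [
    [[1.0, 0.0], [3.0, 0.0]],
    [[0.0, 2.0], [0.0, 4.0]],
]
boxes = [(0, 0, 2, 1), (0, 1, 2, 2)]  # hand-written proposals: top row, bottom row

# In practice the two views would come from different augmentations of the image;
# here the same map is reused so the example stays self-contained.
view_a = [roi_mean_pool(feature_map, box) for box in boxes]
view_b = [roi_mean_pool(feature_map, box) for box in boxes]

aligned_loss = info_nce(view_a, view_b)
shuffled_loss = info_nce(view_a, list(reversed(view_b)))
```

In this toy setup the loss is lower when each proposal is paired with its own counterpart in the other view than when the pairing is shuffled, which is the behaviour an object-level contrastive objective is meant to reward.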
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications
Methods: Region Proposal Network · Contrastive Learning · Convolution · Softmax · Selective Search · RoIAlign · Mask R-CNN
