DETReg: Unsupervised Pretraining with Region Priors for Object Detection
Amir Bar, Xin Wang, Vadim Kantorov, Colorado J Reed, Roei Herzig, Gal Chechik, Anna Rohrbach, Trevor Darrell, Amir Globerson

TL;DR
DETReg is a novel self-supervised pretraining method for object detection that trains the entire detection network, including localization and embedding components, leading to improved performance especially in low-data scenarios.
Contribution
It introduces a comprehensive pretraining approach for the full detection architecture, integrating region priors and self-supervised embeddings, which was lacking in prior methods.
Findings
Improves detection performance on COCO, PASCAL VOC, and Airbus Ship benchmarks.
Enhances low-data and few-shot detection accuracy.
Outperforms competitive baselines in various settings.
Abstract
Recent self-supervised pretraining methods for object detection largely focus on pretraining the backbone of the object detector, neglecting key parts of detection architecture. Instead, we introduce DETReg, a new self-supervised method that pretrains the entire object detection network, including the object localization and embedding components. During pretraining, DETReg predicts object localizations to match the localizations from an unsupervised region proposal generator and simultaneously aligns the corresponding feature embeddings with embeddings from a self-supervised image encoder. We implement DETReg using the DETR family of detectors and show that it improves over competitive baselines when finetuned on COCO, PASCAL VOC, and Airbus Ship benchmarks. In low-data regimes DETReg achieves improved performance, e.g., when training with only 1% of the labels and in the few-shot…
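The pretraining objective described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the plain L1 distances, the loss weights, and the brute-force one-to-one matching (a stand-in for the Hungarian matching typically used with DETR-style detectors) are all assumptions made for illustration. Boxes are treated as 4-number vectors from a hypothetical region proposal generator, and target embeddings as vectors from a hypothetical self-supervised encoder.

```python
from itertools import permutations


def l1(a, b):
    """Sum of absolute differences between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def match_queries(pred_boxes, target_boxes):
    """Assign each target box to a distinct prediction, minimizing total L1 box cost.

    Brute force over permutations: an illustrative stand-in for Hungarian
    matching, feasible only for a handful of boxes. Assumes there are at
    least as many predictions as targets.
    """
    best_cost, best = float("inf"), ()
    for perm in permutations(range(len(pred_boxes)), len(target_boxes)):
        cost = sum(l1(pred_boxes[p], target_boxes[t]) for t, p in enumerate(perm))
        if cost < best_cost:
            best_cost, best = cost, perm
    return list(best)  # best[t] is the prediction index matched to target t


def pretraining_loss(pred_boxes, pred_embs, target_boxes, target_embs,
                     w_box=1.0, w_emb=1.0):
    """Weighted sum of a localization term and an embedding-alignment term.

    target_boxes: pseudo-labels from an unsupervised region proposal method.
    target_embs:  embeddings of those regions from a self-supervised encoder.
    The weights w_box and w_emb are hypothetical, not values from the paper.
    """
    assignment = match_queries(pred_boxes, target_boxes)
    n = max(len(target_boxes), 1)  # avoid division by zero with no proposals
    box_loss = sum(l1(pred_boxes[p], target_boxes[t])
                   for t, p in enumerate(assignment)) / n
    emb_loss = sum(l1(pred_embs[p], target_embs[t])
                   for t, p in enumerate(assignment)) / n
    return w_box * box_loss + w_emb * emb_loss, assignment


if __name__ == "__main__":
    # Three detector queries, two region proposals (toy numbers).
    pred_boxes = [[0.5, 0.5, 0.2, 0.2], [0.1, 0.1, 0.1, 0.1], [0.8, 0.8, 0.3, 0.3]]
    pred_embs = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
    target_boxes = [[0.1, 0.1, 0.1, 0.1], [0.8, 0.8, 0.3, 0.3]]
    target_embs = [[1.0, 0.0], [0.0, 0.0]]
    loss, assignment = pretraining_loss(pred_boxes, pred_embs, target_boxes, target_embs)
    print(loss, assignment)
```

In this sketch the matching uses only the box cost, and unmatched queries contribute nothing to the loss. A full detector would normally also supervise unmatched queries and run on batched tensors.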
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
Methods: Attention Is All You Need · Linear Layer · Deformable Attention Module · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Deformable DETR · Adam · Label Smoothing · Residual Connection · Dense Connections
