Semi-DETR: Semi-Supervised Object Detection with Detection Transformers
Jiacheng Zhang, Xiangru Lin, Wei Zhang, Kuo Wang, Xiao Tan, Junyu Han, Errui Ding, Jingdong Wang, Guanbin Li

TL;DR
Semi-DETR introduces a transformer-based semi-supervised object detection framework that improves training efficiency and pseudo label quality through hybrid matching, cross-view query consistency, and cost-based pseudo label mining.
Contribution
It presents the first end-to-end semi-supervised detector based on transformers, with novel strategies for matching, consistency learning, and pseudo label mining.
Findings
Outperforms state-of-the-art methods on COCO and Pascal VOC datasets.
Effective in improving pseudo label quality and training efficiency.
Achieves significant performance gains across various SSOD settings.
Abstract
We analyze the DETR-based framework on semi-supervised object detection (SSOD) and observe that (1) the one-to-one assignment strategy generates incorrect matching when the pseudo ground-truth bounding box is inaccurate, leading to training inefficiency; (2) DETR-based detectors lack deterministic correspondence between the input query and its prediction output, which hinders the applicability of the consistency-based regularization widely used in current SSOD methods. We present Semi-DETR, the first transformer-based end-to-end semi-supervised object detector, to tackle these problems. Specifically, we propose a Stage-wise Hybrid Matching strategy that combines the one-to-many assignment and one-to-one assignment strategies to improve the training efficiency of the first stage and thus provide high-quality pseudo labels for the training of the second stage. Besides, we introduce a…
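The contrast between the two assignment strategies named in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration, not the authors' implementation: the function names, the `switch_step` parameter, the top-k selection rule for the one-to-many stage, and the hand-written cost matrix are all hypothetical. The one-to-one matcher uses brute force so the example stays dependency-free; DETR-style code normally uses the Hungarian algorithm for this step.

```python
from itertools import permutations


def one_to_one_assign(cost):
    """Minimum-cost matching in which each pseudo box gets exactly one query.

    cost[q][b] is the matching cost between query q and pseudo box b.
    Brute force over permutations; suitable for a toy example only.
    """
    num_queries, num_boxes = len(cost), len(cost[0])
    best, best_total = None, float("inf")
    for queries in permutations(range(num_queries), num_boxes):
        total = sum(cost[q][b] for b, q in enumerate(queries))
        if total < best_total:
            best, best_total = queries, total
    return {b: [q] for b, q in enumerate(best)}


def one_to_many_assign(cost, k):
    """Each pseudo box takes its k lowest-cost queries as positives.

    Assumption: top-k by cost. This sketch does not resolve the case
    where one query lands in the top-k of two different boxes.
    """
    num_queries, num_boxes = len(cost), len(cost[0])
    return {
        b: sorted(range(num_queries), key=lambda q: cost[q][b])[:k]
        for b in range(num_boxes)
    }


def stage_wise_assign(cost, step, switch_step, k=2):
    """Hypothetical schedule: one-to-many early, one-to-one afterwards."""
    if step < switch_step:
        return one_to_many_assign(cost, k)
    return one_to_one_assign(cost)


# Toy cost matrix: 4 queries (rows) x 2 pseudo boxes (columns).
cost = [
    [0.9, 0.2],
    [0.1, 0.8],
    [0.3, 0.7],
    [0.6, 0.25],
]

print(stage_wise_assign(cost, step=0, switch_step=10))   # {0: [1, 2], 1: [0, 3]}
print(stage_wise_assign(cost, step=10, switch_step=10))  # {0: [1], 1: [0]}
```

In the first stage each pseudo box supervises several queries, so a slightly inaccurate pseudo box still yields some useful positives. The second stage reverts to a single query per box, which is what keeps a DETR-style detector end-to-end.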
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
