Rank-DETR for High Quality Object Detection
Yifan Pu, Weicong Liang, Yiduo Hao, Yuhui Yuan, Yukang Yang, Chao Zhang, Han Hu, Gao Huang

TL;DR
Rank-DETR introduces rank-oriented architecture and loss functions to improve the accuracy of object ranking and localization in DETR-based detectors, leading to higher quality detection results across various backbones.
Contribution
The paper proposes a novel rank-oriented design and loss function for DETR-based detectors, enhancing localization accuracy and ranking quality for better detection performance.
Findings
Achieves state-of-the-art COCO detection results with various backbones
Improves localization accuracy at high IoU thresholds
Reduces false positive rates in object detection
Abstract
Modern detection transformers (DETRs) use a set of object queries to predict a list of bounding boxes, sort them by their classification confidence scores, and select the top-ranked predictions as the final detection results for the given input image. A highly performant object detector requires accurate ranking of the bounding box predictions. For DETR-based detectors, the top-ranked bounding boxes suffer from less accurate localization because classification scores and localization accuracy are misaligned, which impedes the construction of high-quality detectors. In this work, we introduce a simple and highly performant DETR-based object detector by proposing a series of rank-oriented designs, collectively called Rank-DETR. Our key contributions include: (i) a rank-oriented architecture design that can prompt positive predictions and suppress the negative ones to…
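The ranking problem the abstract describes can be illustrated with a minimal sketch. This is not the Rank-DETR method itself. It only shows the generic "sort by classification score, keep the top-k" step and how a high score can coincide with a poorly localized box. The helper names (`box_iou`, `select_top_k`) and all boxes and scores are hypothetical values made up for illustration.

```python
from typing import List, Tuple

# Hypothetical box format assumed for this sketch: (x1, y1, x2, y2).
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_top_k(scores: List[float], k: int) -> List[int]:
    """Return indices of the k highest-scoring predictions, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Illustrative data only: one ground-truth box and three query predictions.
gt: Box = (0.0, 0.0, 10.0, 10.0)
boxes: List[Box] = [
    (0.0, 0.0, 10.0, 10.0),   # perfectly localized, but low score
    (2.0, 2.0, 12.0, 12.0),   # poorly localized, but highest score
    (0.0, 0.0, 9.0, 10.0),    # well localized, medium score
]
scores = [0.60, 0.90, 0.75]

ranked = select_top_k(scores, k=2)
ious = [box_iou(b, gt) for b in boxes]

for rank, idx in enumerate(ranked, start=1):
    print(f"rank {rank}: box {idx}, score={scores[idx]:.2f}, IoU={ious[idx]:.2f}")
```

In this toy case the top-ranked prediction is the worst-localized box, and the perfectly localized box does not make the top two. That is the kind of score/localization misalignment the rank-oriented designs aim to reduce.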
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
