Cascade-DETR: Delving into High-Quality Universal Object Detection
Mingqiao Ye, Lei Ke, Siyuan Li, Yu-Wing Tai, Chi-Keung Tang, Martin Danelljan, Fisher Yu

TL;DR
Cascade-DETR advances universal object detection by integrating object-centric attention and IoU-based scoring, significantly improving accuracy across diverse datasets including COCO and UDB10.
Contribution
The paper introduces Cascade Attention and IoU prediction techniques to enhance generalization and localization precision in Transformer-based detectors.
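For reference, the quantity the IoU prediction targets is the intersection-over-union between a predicted box and its ground-truth box. The sketch below computes IoU for corner-format boxes and then ranks two queries by a predicted IoU rather than by classification score. The query dictionaries, their field names, and their values are hypothetical and serve only as illustration; this is not the paper's implementation.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes have an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


gt = (0.0, 0.0, 10.0, 10.0)
pred = (5.0, 0.0, 15.0, 10.0)
iou = box_iou(pred, gt)  # intersection 50, union 150

# Hypothetical queries: one is confident about the class but poorly
# localized, the other is less confident but tightly localized.
queries = [
    {"id": 0, "cls_score": 0.95, "pred_iou": 0.55},
    {"id": 1, "cls_score": 0.80, "pred_iou": 0.90},
]
ranked_by_cls = sorted(queries, key=lambda q: q["cls_score"], reverse=True)
ranked_by_iou = sorted(queries, key=lambda q: q["pred_iou"], reverse=True)
```

Ranking by classification score puts query 0 first, while ranking by predicted IoU puts the better-localized query 1 first, which is the behavior one would want under strict localization requirements.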
Findings
Outperforms prior state-of-the-art detectors on COCO and UDB10.
Improves mAP by more than 10 points in some cases.
Excels under stringent quality requirements.
Abstract
Object localization in general environments is a fundamental part of vision systems. While dominating on the COCO benchmark, recent Transformer-based detection methods are not competitive in diverse domains. Moreover, these methods still struggle to very accurately estimate the object bounding boxes in complex environments. We introduce Cascade-DETR for high-quality universal object detection. We jointly tackle the generalization to diverse domains and localization accuracy by proposing the Cascade Attention layer, which explicitly integrates object-centric information into the detection decoder by limiting the attention to the previous box prediction. To further enhance accuracy, we also revisit the scoring of queries. Instead of relying on classification scores, we predict the expected IoU of the query, leading to substantially more well-calibrated confidences. Lastly, we introduce…
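The idea of limiting attention to the previous box prediction can be illustrated with a minimal sketch. It assumes a single query, a small H x W feature grid, and a box in normalized (cx, cy, w, h) format. The function names, the box format, and the fallback to full attention for an empty mask are assumptions made here, not details taken from the paper.

```python
import math


def box_mask(box, height, width):
    """Boolean H x W mask, True where a cell centre lies inside `box`.

    `box` is (cx, cy, w, h) in coordinates normalized to [0, 1].
    """
    cx, cy, w, h = box
    x1, x2 = cx - w / 2, cx + w / 2
    y1, y2 = cy - h / 2, cy + h / 2
    mask = []
    for row in range(height):
        y = (row + 0.5) / height
        mask.append([
            x1 <= (col + 0.5) / width <= x2 and y1 <= y <= y2
            for col in range(width)
        ])
    return mask


def masked_attention(scores, mask):
    """Softmax over the flattened scores, restricted to True mask cells."""
    flat_s = [s for row in scores for s in row]
    flat_m = [m for row in mask for m in row]
    if not any(flat_m):
        # Assumption: fall back to full attention if the box covers no cell.
        flat_m = [True] * len(flat_m)
    peak = max(s for s, m in zip(flat_s, flat_m) if m)
    exps = [math.exp(s - peak) if m else 0.0 for s, m in zip(flat_s, flat_m)]
    total = sum(exps)
    return [e / total for e in exps]


# Box predicted by the previous decoder layer (hypothetical values).
prev_box = (0.5, 0.5, 0.5, 0.5)
mask = box_mask(prev_box, 4, 4)
scores = [[0.0] * 4 for _ in range(4)]  # uniform query-key scores
weights = masked_attention(scores, mask)
```

With uniform scores, all attention mass falls on the four grid cells inside the box and every cell outside it receives zero weight. In a multi-layer decoder, each layer would rebuild the mask from the box refined by the layer before it.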
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
