TL;DR
Dual-R-DETR introduces pairwise routing in transformer decoders to explicitly regulate query interactions, improving object detection performance without increasing inference cost.
Contribution
It proposes a novel competition-aware framework with learnable routing biases that differentiate query relations, enhancing detection accuracy over existing DETR variants.
Findings
Outperforms DINO by 1.7% mAP on COCO with a ResNet-50 backbone.
Achieves 57.6% mAP on COCO with a Swin-L backbone.
Improves query diversity and detection efficiency.
Abstract
Detection Transformers (DETR) formulate object detection as a set prediction problem and enable end-to-end training without post-processing. However, object queries in DETR interact through symmetric self-attention, which enforces uniform competition among all query pairs. This often leads to inefficient query dynamics, where multiple queries converge to the same object while others fail to explore alternative regions. We propose Dual-R-DETR, a competition-aware DETR framework that explicitly regulates query interactions via pairwise routing in transformer decoders. Dual-R-DETR distinguishes query-to-query relations as either competitive or cooperative based on appearance similarity, prediction confidence, and spatial geometry. It introduces two complementary routing behaviors: suppressor routing to attenuate interactions among queries targeting the same object, and delegator routing to…
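The routing idea in the abstract can be illustrated with a small sketch: an additive pairwise bias on the query self-attention logits, so that competitive pairs attend to each other less than cooperative pairs. This is not the authors' implementation. The function names (`box_iou`, `routed_self_attention`), the fixed scalars `suppress_bias` and `coop_bias` (stand-ins for the learnable routing biases), and the use of box IoU alone to decide which pairs are competitive are all assumptions made for illustration; the paper also uses appearance similarity and prediction confidence, which are omitted here.

```python
import numpy as np

def box_iou(boxes):
    """Pairwise IoU for boxes given as (x1, y1, x2, y2); returns an (N, N) array."""
    x1 = np.maximum(boxes[:, None, 0], boxes[None, :, 0])
    y1 = np.maximum(boxes[:, None, 1], boxes[None, :, 1])
    x2 = np.minimum(boxes[:, None, 2], boxes[None, :, 2])
    y2 = np.minimum(boxes[:, None, 3], boxes[None, :, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    union = area[:, None] + area[None, :] - inter
    return inter / np.maximum(union, 1e-9)

def routed_self_attention(queries, boxes, suppress_bias=-2.0,
                          coop_bias=0.5, iou_thresh=0.5):
    """Single-head query self-attention with an additive pairwise routing bias.

    Hypothetical rule: a pair is 'competitive' if its predicted boxes overlap
    with IoU >= iou_thresh, otherwise 'cooperative'. Keys, queries, and values
    share the same vectors (no projections) to keep the sketch short.
    """
    n, d = queries.shape
    logits = queries @ queries.T / np.sqrt(d)

    competitive = box_iou(boxes) >= iou_thresh
    np.fill_diagonal(competitive, False)      # a query does not compete with itself

    # Assumed fixed scalars; in a trained model these would be learned.
    bias = np.where(competitive, suppress_bias, coop_bias)
    np.fill_diagonal(bias, 0.0)               # leave self-attention logits unbiased

    logits = logits + bias
    logits = logits - logits.max(axis=1, keepdims=True)   # numerically stable softmax
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ queries, weights

# Three identical queries: 0 and 1 predict nearly the same box, 2 is far away.
queries = np.ones((3, 4))
boxes = np.array([[0.0, 0.0, 10.0, 10.0],
                  [1.0, 1.0, 10.0, 10.0],
                  [50.0, 50.0, 60.0, 60.0]])
out, weights = routed_self_attention(queries, boxes)
```

Because the three query vectors are identical, the unbiased logits are uniform and any difference in `weights` comes from the bias alone: query 0 attends less to query 1 (overlapping box) than to query 2 (distant box). Symmetric self-attention without the bias would treat both pairs the same.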
