Box-DETR: Understanding and Boxing Conditional Spatial Queries

Wenze Liu; Hao Lu; Yuliang Liu; Zhiguo Cao

arXiv:2307.08353 · cs.CV · July 18, 2023 · 1 citation

TL;DR

This paper introduces Box-DETR, which replaces the box center with head-specific box agent points as the reference points for conditional spatial queries in DETR, leading to faster convergence and improved detection accuracy.

Contribution

It proposes Box Agent to incorporate full box information into cross-attention, significantly improving DETR's performance with minimal computational overhead.

Findings

01 Box Agent leads to faster convergence of the detector.

02 Box Agent improves detection accuracy.

03 The single-scale model achieves 44.2 AP with a ResNet-50 backbone.

Abstract

Conditional spatial queries were recently introduced into DEtection TRansformer (DETR) to accelerate convergence. In DAB-DETR, such queries are modulated by the so-called conditional linear projection at each decoder stage, aiming to search for positions of interest such as the four extremities of the box. Each decoder stage progressively updates the box by predicting the anchor box offsets, while in cross-attention only the box center is provided as the reference point. Using only the box center, however, leaves the width and height of the previous box unknown to the current stage, which hinders accurate prediction of offsets. We argue that the explicit use of the entire box information in cross-attention matters. In this work, we propose Box Agent to condense the box into head-specific agent points. By replacing the box center with the agent point as the reference point in each head,…
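
To make the idea of head-specific agent points concrete, here is a minimal sketch in plain Python. It is not taken from the paper or the official repository. The function name `box_agent_points` and the per-head relative offsets are hypothetical, and the way the offsets are obtained (in practice they would presumably be predicted by the model) is not covered by the abstract. The sketch only illustrates how a reference point can depend on the width and height of the previous box instead of on its center alone.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (cx, cy, w, h), normalized to [0, 1]
Point = Tuple[float, float]


def box_agent_points(box: Box, head_offsets: List[Point]) -> List[Point]:
    """Map one box to one reference point per attention head.

    ASSUMPTION (not from the source): each head has a relative offset
    (dx, dy) in [-0.5, 0.5], so its agent point always lies inside the box.
    An offset of (0, 0) falls back to the box center, which is the
    reference point the abstract attributes to DAB-DETR.
    """
    cx, cy, w, h = box
    points: List[Point] = []
    for dx, dy in head_offsets:
        if not (-0.5 <= dx <= 0.5 and -0.5 <= dy <= 0.5):
            raise ValueError("relative offsets must lie in [-0.5, 0.5]")
        # Scaling by w and h is what exposes the box size to the head.
        points.append((cx + dx * w, cy + dy * h))
    return points


if __name__ == "__main__":
    prev_box = (0.5, 0.5, 0.4, 0.2)  # box from the previous decoder stage
    offsets = [(0.0, 0.0), (-0.5, 0.0), (0.5, 0.0), (0.0, 0.5)]
    for head, point in enumerate(box_agent_points(prev_box, offsets)):
        print(f"head {head}: reference point {point}")
```

With these illustrative offsets, the first head starts its search at the box center and the other three start on the left, right, and bottom edges, so each head begins closer to a different extremity of the previous box.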

Code & Models

Repositories

tiny-smart/box-detr (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Anomaly Detection Techniques and Applications