QueryDet: Cascaded Sparse Query for Accelerating High-Resolution Small Object Detection
Chenhongyi Yang, Zehao Huang, Naiyan Wang

TL;DR
QueryDet introduces a cascaded sparse query mechanism that accelerates high-resolution small object detection by predicting coarse locations on low-res features and refining on high-res features, boosting speed and accuracy.
Contribution
The paper proposes a query mechanism for feature-pyramid-based detectors that speeds up high-resolution small object detection while maintaining accuracy.
Findings
Improves mAP by 1.0 and mAP-small by 2.0 on the COCO dataset.
Achieves a 3.0x speedup in high-resolution inference.
Sets a new state of the art on the VisDrone dataset with a 2.3x acceleration.
Abstract
While general object detection with deep learning has achieved great success in the past few years, the performance and efficiency of detecting small objects are far from satisfactory. The most common and effective way to improve small object detection is to use high-resolution images or feature maps. However, both approaches are computationally expensive, since the cost grows quadratically with the side length of the images and feature maps. To get the best of both worlds, we propose QueryDet, which uses a novel query mechanism to accelerate the inference speed of feature-pyramid-based object detectors. The pipeline consists of two steps: it first predicts the coarse locations of small objects on low-resolution features and then computes the accurate detection results using high-resolution features sparsely guided by those coarse positions. In this way, we can not only harvest the benefit of…
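The two-step pipeline described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the 0.5 threshold, the 2x scale between pyramid levels, and the stand-in `head` function are all assumptions chosen for illustration. It only shows the idea of selecting coarse positions on a low-resolution map and then evaluating a detection head at the matching high-resolution positions instead of everywhere.

```python
from typing import Callable, Dict, List, Set, Tuple

Position = Tuple[int, int]  # (row, col) on a feature map


def coarse_queries(low_res_scores: List[List[float]], threshold: float) -> List[Position]:
    """Step 1: keep low-resolution positions whose small-object score exceeds the threshold."""
    return [
        (r, c)
        for r, row in enumerate(low_res_scores)
        for c, score in enumerate(row)
        if score > threshold
    ]


def expand_to_high_res(queries: List[Position], scale: int = 2) -> List[Position]:
    """Map each low-res query to the scale x scale block it covers on the high-res map."""
    keys: Set[Position] = set()
    for r, c in queries:
        for dr in range(scale):
            for dc in range(scale):
                keys.add((r * scale + dr, c * scale + dc))
    return sorted(keys)


def sparse_detect(
    high_res_features: List[List[float]],
    keys: List[Position],
    head: Callable[[float], float],
) -> Dict[Position, float]:
    """Step 2: run the (hypothetical) head only at the selected high-res positions."""
    return {(r, c): head(high_res_features[r][c]) for r, c in keys}


# Toy data: a 2x2 low-res score map and a 4x4 high-res feature map (assumed values).
low_res_scores = [
    [0.1, 0.9],
    [0.2, 0.3],
]
high_res_features = [[float(4 * r + c) for c in range(4)] for r in range(4)]

queries = coarse_queries(low_res_scores, threshold=0.5)  # assumed threshold
keys = expand_to_high_res(queries)
detections = sparse_detect(high_res_features, keys, head=lambda x: 2.0 * x)

dense_positions = len(high_res_features) * len(high_res_features[0])
print(f"head evaluated at {len(keys)} of {dense_positions} high-res positions")
```

In this toy case a single coarse query selects 4 of the 16 high-resolution positions. The 2x2 versus 4x4 maps also illustrate the quadratic cost growth mentioned in the abstract: doubling the side length quadruples the number of positions a dense head would have to process.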
Taxonomy
Topics: Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning
