RefAerial: A Benchmark and Approach for Referring Detection in Aerial Images
Guyue Hu, Hao Song, Yuxing Tong, Duzhi Yuan, Dengdi Sun, Aihua Zheng, Chenglong Li, Jin Tang

TL;DR
This paper introduces RefAerial, a challenging large-scale aerial image dataset for referring detection, along with a novel SCS framework that addresses scale variation issues and improves detection accuracy.
Contribution
The paper presents a new aerial referring detection dataset, RefAerial, and proposes the SCS framework with MoG attention and CtS decoding for better scale-aware detection.
Findings
The RefAerial dataset exhibits diverse scenes and complex descriptions.
The SCS framework outperforms existing methods on aerial images.
The method also improves performance on ground referring detection datasets.
Abstract
Referring detection refers to locating the target referred to by natural language, which has recently attracted growing research interest. However, existing datasets are limited to ground images with large objects centered in relatively small scenes. This paper introduces a large-scale challenging dataset for referring detection in aerial images, termed RefAerial. It is distinguished from conventional ground referring detection datasets by four characteristics: (1) low but diverse object-to-scene ratios, (2) numerous targets and distractors, (3) complex and fine-grained referring descriptions, (4) diverse and broad scenes in the aerial view. We also develop a human-in-the-loop referring expansion and annotation engine (REA-Engine) for efficient semi-automated referring pair annotation. Besides, we observe that existing ground referring detection approaches exhibit serious performance…
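To make the first characteristic concrete, here is a minimal sketch of how an object-to-scene ratio could be computed. It assumes the ratio means bounding-box area divided by image area, which is an interpretation and not a definition stated in the abstract. The function name, box format, and all coordinates and image sizes are hypothetical and not taken from RefAerial.

```python
def object_to_scene_ratio(box, image_width, image_height):
    """Fraction of the image area covered by a bounding box.

    Assumption: box is (x_min, y_min, x_max, y_max) in pixels.
    """
    x_min, y_min, x_max, y_max = box
    # Clamp to zero so a degenerate or inverted box yields an area of 0.
    box_area = max(0, x_max - x_min) * max(0, y_max - y_min)
    return box_area / (image_width * image_height)


# Hypothetical ground-view example: a large object centered in a small scene.
ground_ratio = object_to_scene_ratio((100, 80, 420, 400), 640, 480)

# Hypothetical aerial-view example: a small target in a broad scene.
aerial_ratio = object_to_scene_ratio((1000, 700, 1040, 730), 1920, 1080)

print(f"ground: {ground_ratio:.4f}, aerial: {aerial_ratio:.6f}")
```

Under this assumed definition, the illustrative ground-view box covers about a third of the image, while the aerial-view box covers well under a tenth of a percent, which is the kind of gap that "low but diverse object-to-scene ratios" suggests.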
