A Comparative Attention Framework for Better Few-Shot Object Detection on Aerial Images
Pierre Le Jeune, Anissa Mokraoui

TL;DR
This paper introduces a benchmarking framework for attention-based Few-Shot Object Detection (FSOD) methods on aerial images. The comparison shows that FSOD methods perform worse on aerial images than on natural images, and the paper proposes a multiscale alignment technique to improve small object detection.
Contribution
The paper presents a flexible benchmarking framework for attention-based FSOD, reimplements and compares existing methods within it, and introduces XQSA (Cross-Scales Query-Support Alignment), a multiscale alignment approach that improves small object detection in aerial images.
Findings
FSOD performs worse on aerial images than on natural images.
Small objects are a key challenge in few-shot detection.
XQSA significantly outperforms existing methods on DOTA and DIOR datasets.
Abstract
Few-Shot Object Detection (FSOD) methods are mainly designed and evaluated on natural image datasets such as Pascal VOC and MS COCO. However, it is not clear whether the best methods for natural images are also the best for aerial images. Furthermore, direct comparison of performance between FSOD methods is difficult due to the wide variety of detection frameworks and training strategies. Therefore, we propose a benchmarking framework that provides a flexible environment to implement and compare attention-based FSOD methods. The proposed framework focuses on attention mechanisms and is divided into three modules: spatial alignment, global attention, and fusion layer. To remain competitive with existing methods, which often leverage complex training, we propose new augmentation techniques designed for object detection. Using this framework, several FSOD methods are reimplemented and…
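The three-module split named in the abstract (spatial alignment, global attention, fusion layer) can be illustrated with a small sketch. This is not the authors' implementation: the function names, the dot-product attention, the sigmoid channel gate, and the concatenation-based fusion are all assumptions chosen to show how such modules could compose. Features are plain lists of per-position channel vectors.

```python
import math
from typing import List

# Hypothetical representation: rows are spatial positions, columns are channels.
Matrix = List[List[float]]


def softmax(xs: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def spatial_alignment(query: Matrix, support: Matrix) -> Matrix:
    """Assumed module 1: for each query position, build a support feature
    aligned to it with scaled dot-product attention over support positions."""
    d = len(query[0])
    aligned = []
    for q in query:
        scores = [sum(a * b for a, b in zip(q, s)) / math.sqrt(d) for s in support]
        weights = softmax(scores)
        aligned.append(
            [sum(w * s[c] for w, s in zip(weights, support)) for c in range(d)]
        )
    return aligned


def global_attention(query: Matrix, support: Matrix) -> Matrix:
    """Assumed module 2: reweight query channels with a sigmoid gate computed
    from the globally average-pooled support features."""
    d = len(query[0])
    pooled = [sum(s[c] for s in support) / len(support) for c in range(d)]
    gate = [1.0 / (1.0 + math.exp(-p)) for p in pooled]
    return [[q[c] * gate[c] for c in range(d)] for q in query]


def fusion(query: Matrix, aligned: Matrix) -> Matrix:
    """Assumed module 3: concatenate query and aligned support channels
    at every spatial position."""
    return [q + a for q, a in zip(query, aligned)]


def attend(query: Matrix, support: Matrix) -> Matrix:
    """Compose the three modules into one query-support attention block."""
    aligned = spatial_alignment(query, support)
    reweighted = global_attention(query, support)
    return fusion(reweighted, aligned)


if __name__ == "__main__":
    query = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]   # 3 positions, 2 channels
    support = [[2.0, 0.0], [0.0, 2.0]]             # 2 positions, 2 channels
    fused = attend(query, support)
    print(len(fused), len(fused[0]))               # positions, fused channels
```

Keeping each module behind its own function means a different alignment or fusion strategy can be swapped in without touching the others, which is the kind of flexibility a comparison framework needs.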
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
