Transformers in Small Object Detection: A Benchmark and Survey of State-of-the-Art
Aref Miri Rekavandi, Shima Rashidi, Farid Boussaid, Stephen Hoefs, Emre Akbas, Mohammed Bennamoun

TL;DR
This paper surveys the use of transformer-based models for small object detection across various domains, highlighting their superior performance over CNNs and providing a comprehensive taxonomy, datasets, and evaluation metrics.
Contribution
It offers the first extensive taxonomy of over 60 transformer-based small object detection (SOD) studies from 2020 to 2023, along with a curated list of overlooked datasets and performance comparisons.
Findings
Transformers outperform well-established CNN-based detectors in small object detection on almost every video and image dataset examined.
A comprehensive taxonomy of transformer-based SOD methods is provided.
Performance metrics like mAP and FPS are used for comparison.
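As a rough illustration of the mAP metric mentioned above, the following is a minimal sketch of average precision (AP) for a single class on a single image; mAP is the mean of such AP values over classes. It is not the survey's evaluation protocol: the function names `iou` and `average_precision`, the `(x1, y1, x2, y2)` box format, the greedy matching rule, and the 0.5 IoU threshold are all assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truths, iou_threshold=0.5):
    """AP for one class (hypothetical helper, simplified matching).

    detections: list of (score, box); ground_truths: list of boxes.
    """
    if not ground_truths:
        return 0.0
    # Process detections from most to least confident.
    detections = sorted(detections, key=lambda d: d[0], reverse=True)
    matched = set()
    hits = []
    for _score, box in detections:
        # Greedily pick the best still-unmatched ground-truth box.
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truths):
            if idx in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            hits.append(True)   # true positive
        else:
            hits.append(False)  # false positive

    # Precision and recall after each ranked detection.
    true_positives = 0
    precisions, recalls = [], []
    for rank, hit in enumerate(hits, start=1):
        true_positives += hit
        precisions.append(true_positives / rank)
        recalls.append(true_positives / len(ground_truths))

    # Make precision non-increasing from right to left (interpolation).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


if __name__ == "__main__":
    gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
    dets = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (20, 20, 30, 30))]
    print(average_precision(dets, gts))
```

Small objects make this metric unforgiving: a shift of a few pixels can push the IoU of a tiny box below the threshold, so a detection that looks nearly correct is counted as a false positive.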
Abstract
Transformers have rapidly gained popularity in computer vision, especially in the field of object recognition and detection. Upon examining the outcomes of state-of-the-art object detection methods, we noticed that transformers consistently outperformed well-established CNN-based detectors in almost every video or image dataset. While transformer-based approaches remain at the forefront of small object detection (SOD) techniques, this paper aims to explore the performance benefits offered by such extensive networks and identify potential reasons for their SOD superiority. Small objects have been identified as one of the most challenging object types in detection frameworks due to their low visibility. We aim to investigate potential strategies that could enhance transformers' performance in SOD. This survey presents a taxonomy of over 60 research studies on developed transformers for…
Taxonomy
Topics: Advanced Neural Network Applications · Visual Attention and Saliency Detection · IoT and Edge/Fog Computing
