Small Object Detection in Complex Backgrounds with Multi-Scale Attention and Global Relation Modeling
Wenguang Tao, Xiaotian Wang, Tian Yan, Yi Wang, Jie Yan

TL;DR
This paper proposes a multi-scale attention and global relation modeling framework for small object detection in complex backgrounds, targeting the feature degradation and localization errors that downsampling and background interference cause.
Contribution
The paper proposes a framework built from three modules tailored for small object detection: Residual Haar Wavelet Downsampling, Global Relation Modeling, and Cross-Scale Hybrid Attention.
Findings
Outperforms state-of-the-art detectors on the RGBT-Tiny benchmark.
Enhances feature preservation and semantic awareness for small objects.
Improves localization accuracy and robustness in complex environments.
Abstract
Small object detection under complex backgrounds remains a challenging task due to severe feature degradation, weak semantic representation, and inaccurate localization caused by downsampling operations and background interference. Existing detection frameworks are mainly designed for general objects and often fail to explicitly address the unique characteristics of small objects, such as limited structural cues and strong sensitivity to localization errors. In this paper, we propose a multi-level feature enhancement and global relation modeling framework tailored for small object detection. Specifically, a Residual Haar Wavelet Downsampling module is introduced to preserve fine-grained structural details by jointly exploiting spatial-domain convolutional features and frequency-domain representations. To enhance global semantic awareness and suppress background noise, a Global Relation…
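The frequency-domain half of the downsampling idea can be illustrated with a plain single-level 2D Haar transform. The sketch below is a minimal illustration only, not the authors' Residual Haar Wavelet Downsampling module: the function names `haar_downsample` and `haar_reconstruct` and the sub-band names are assumptions made for this example, and the residual convolutional branch mentioned in the abstract is not modeled. It shows why a Haar step is attractive for small objects: resolution is halved, yet the four sub-bands together retain every input value, whereas stride-2 subsampling keeps only one pixel in four.

```python
import numpy as np


def haar_downsample(x):
    """Single-level 2D Haar transform of an (H, W) array with even H and W.

    Returns four (H/2, W/2) sub-bands: a low-pass approximation and three
    detail bands. Names are illustrative, not taken from the paper.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim != 2 or x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("expected a 2D array with even height and width")

    # The four pixels of every non-overlapping 2x2 block.
    a = x[0::2, 0::2]  # top-left
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right

    # Orthonormal Haar combinations (the 1/2 factor preserves energy).
    approx = (a + b + c + d) / 2.0       # local average (low-pass)
    detail_cols = (a - b + c - d) / 2.0  # change across columns
    detail_rows = (a + b - c - d) / 2.0  # change across rows
    detail_diag = (a - b - c + d) / 2.0  # diagonal change
    return approx, detail_cols, detail_rows, detail_diag


def haar_reconstruct(approx, detail_cols, detail_rows, detail_diag):
    """Invert haar_downsample, recovering the original (H, W) array."""
    h, w = approx.shape
    x = np.empty((2 * h, 2 * w), dtype=float)
    x[0::2, 0::2] = (approx + detail_cols + detail_rows + detail_diag) / 2.0
    x[0::2, 1::2] = (approx - detail_cols + detail_rows - detail_diag) / 2.0
    x[1::2, 0::2] = (approx + detail_cols - detail_rows - detail_diag) / 2.0
    x[1::2, 1::2] = (approx - detail_cols - detail_rows + detail_diag) / 2.0
    return x


if __name__ == "__main__":
    image = np.arange(16, dtype=float).reshape(4, 4)
    bands = haar_downsample(image)
    print([band.shape for band in bands])
    print(np.allclose(haar_reconstruct(*bands), image))
```

In a detector, the four sub-bands would typically be stacked along the channel axis and passed to a convolution, so the network sees the half-resolution map together with the edge information that ordinary striding would discard. How the paper fuses this with its spatial-domain branch is not specified in the text above.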
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Video Surveillance and Tracking Methods
