MGDFIS: Multi-scale Global-detail Feature Integration Strategy for Small Object Detection

Yuxiang Wang; Xuecheng Bai; Boyu Hu; Chuanzhi Xu; Haodong Chen; Vera Chung; Tingxue Li; Xiaoming Chen

arXiv:2506.12697·cs.CV·August 14, 2025

Open Access

TL;DR

MGDFIS introduces a unified multi-scale feature integration framework that enhances small object detection in UAV imagery by combining global context and local details efficiently.

Contribution

The paper proposes a fusion strategy built from three modules that improves detection accuracy while maintaining computational efficiency.

Findings

01. Outperforms state-of-the-art methods on the VisDrone benchmark.

02. Achieves higher precision and recall across various architectures.

03. Maintains low inference time suitable for resource-constrained UAVs.

Abstract

Small object detection in UAV imagery is crucial for applications such as search-and-rescue, traffic monitoring, and environmental surveillance, but it is hampered by tiny object size, low signal-to-noise ratios, and limited feature extraction. Existing multi-scale fusion methods help, but add computational burden and blur fine details, making small object detection in cluttered scenes difficult. To overcome these challenges, we propose the Multi-scale Global-detail Feature Integration Strategy (MGDFIS), a unified fusion framework that tightly couples global context with local detail to boost detection performance while maintaining efficiency. MGDFIS comprises three synergistic modules: the FusionLock-TSS Attention Module, which marries token-statistics self-attention with DynamicTanh normalization to highlight spectral and spatial cues at minimal cost; the Global-detail Integration…
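
The abstract names DynamicTanh normalization but this listing does not give its formula. As a rough illustration only, the sketch below assumes the commonly described element-wise form DyT(x) = gamma * tanh(alpha * x) + beta, with a scalar alpha and per-channel gamma and beta. The function name `dynamic_tanh`, the default `alpha=0.5`, and the sample feature values are hypothetical, and nothing here reproduces how MGDFIS couples this step with token-statistics self-attention.

```python
import math
from typing import List, Optional, Sequence


def dynamic_tanh(
    x: Sequence[float],
    alpha: float = 0.5,                       # assumed scalar scale (learnable in practice)
    gamma: Optional[Sequence[float]] = None,  # assumed per-channel gain
    beta: Optional[Sequence[float]] = None,   # assumed per-channel shift
) -> List[float]:
    """Element-wise gamma * tanh(alpha * x) + beta (assumed DyT form)."""
    n = len(x)
    gamma = [1.0] * n if gamma is None else gamma
    beta = [0.0] * n if beta is None else beta
    # tanh squashes large activations into (-1, 1) without computing
    # any mean or variance statistics over the input.
    return [g * math.tanh(alpha * v) + b for v, g, b in zip(x, gamma, beta)]


# Hypothetical feature vector with two outlier activations.
features = [-8.0, -0.5, 0.0, 0.5, 8.0]
normalized = dynamic_tanh(features)
```

With unit gain and zero shift, the outputs stay strictly inside (-1, 1) and preserve the ordering of the inputs, which is the bounded, statistics-free behavior that makes this kind of layer inexpensive.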

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques