# Spatio-Temporal Feature Fusion for Anti-UAV Detection: Integrating Inter-Frame Dynamics and Appearance

**Authors:** Yake Zhang, Xiaoxi Fu, Yunfeng Zhou, Xiaojun Guo, Bei Sun, Yinglong Wang, Yongping Zhai

PMC · DOI: 10.3390/s26051492 · Sensors (Basel, Switzerland) · 2026-02-27

## TL;DR

This paper presents a method for detecting small UAVs against complex backgrounds by fusing an improved YOLO11 static detector with a motion target detection module.

## Contribution

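The approach fuses spatio-temporal information: an improved YOLO11 detector supplies per-frame appearance cues, a motion target detection module supplies inter-frame dynamics, and a combination strategy merges the static and dynamic judgments.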
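The idea of combining a static judgment with a dynamic one can be sketched in a few lines. This summary does not describe the paper's actual motion module or fusion rule, so the following is only an illustrative sketch under stated assumptions: motion is approximated by simple two-frame differencing, and the function names (`motion_mask`, `motion_ratio`, `fuse`) and all thresholds (`thresh`, `conf_hi`, `conf_lo`, `min_motion`) are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels, x2/y2 exclusive
Frame = List[List[int]]          # grayscale image as rows of 0-255 intensities


def motion_mask(prev: Frame, curr: Frame, thresh: int = 25) -> List[List[bool]]:
    """Flag pixels whose intensity changed by more than `thresh` between frames.

    Assumption: plain frame differencing stands in for the paper's motion module.
    """
    return [
        [abs(c - p) > thresh for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def motion_ratio(mask: List[List[bool]], box: Box) -> float:
    """Fraction of pixels inside `box` that are flagged as moving."""
    x1, y1, x2, y2 = box
    area = max(0, x2 - x1) * max(0, y2 - y1)
    if area == 0:
        return 0.0
    moving = sum(mask[y][x] for y in range(y1, y2) for x in range(x1, x2))
    return moving / area


def fuse(detections: List[Tuple[Box, float]], mask: List[List[bool]],
         conf_hi: float = 0.6, conf_lo: float = 0.2,
         min_motion: float = 0.3) -> List[Tuple[Box, float]]:
    """Keep confident boxes outright; keep weak boxes only if they show motion.

    Assumption: this two-threshold rule is hypothetical, not the paper's strategy.
    """
    kept = []
    for box, conf in detections:
        if conf >= conf_hi:
            kept.append((box, conf))
        elif conf >= conf_lo and motion_ratio(mask, box) >= min_motion:
            kept.append((box, conf))
    return kept


# Toy example: a 6x6 scene where a 2x2 patch brightens between two frames.
prev = [[0] * 6 for _ in range(6)]
curr = [row[:] for row in prev]
for y in (2, 3):
    for x in (2, 3):
        curr[y][x] = 200

detections = [
    ((2, 2, 4, 4), 0.35),  # weak box on the moving patch
    ((0, 0, 2, 2), 0.35),  # weak box on static background
    ((4, 4, 6, 6), 0.90),  # confident box, kept regardless of motion
]
fused = fuse(detections, motion_mask(prev, curr))
```

In this toy scene the weak box over the moving patch survives, the weak box over static background is discarded, and the confident box passes through unchanged. A real air-to-air setting would also need camera-motion compensation before differencing, which this sketch omits.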

## Key findings

- The proposed method increases Precision, Recall, and mAP50 by 12.1%, 29.5%, and 29.6%, respectively, compared with the baseline YOLO11 detector.
- MSM-YOLO achieves 94% Precision, 92% Recall, and 86.3% mAP50 for small UAV detection in complex scenarios.
- The method was deployed on a redesigned RK3588 embedded system, reaching 100 fps after optimization and showing practicality for air-to-air UAV detection.

## Abstract

To improve the detection of low-slow-small UAV targets against complex backgrounds, this paper introduces a method that combines spatio-temporal information. It comprises (1) an improved YOLO detector for small UAV detection, (2) a motion target detection module, and (3) an integrated strategy that combines static and dynamic judgments. We first build an improved YOLO11 static detector by combining SPD Conv, BiFPN, and a detection head for high-resolution layers. We then design a dynamic target-detection algorithm that helps the YOLO detector capture minor movement features, and finally introduce a strategy for fusing static detection with dynamic judgment. Experimental results on small UAV datasets covering various sky, mountain, and building backgrounds show that the proposed approach increases Precision, Recall, and mAP50 by 12.1%, 29.5%, and 29.6%, respectively, compared with the baseline YOLO11 detector. The proposed MSM-YOLO achieves Precision, Recall, and mAP50 of 94%, 92%, and 86.3%, respectively, enabling effective detection of small UAV targets in complex scenarios. Ablation experiments further confirm the effectiveness of each module. The method was also deployed on a redesigned RK3588 embedded system, where it reaches 100 fps after optimization, demonstrating its effectiveness and practicality for air-to-air UAV detection applications.
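Assuming "SPD Conv" here refers to the space-to-depth convolution building block (a space-to-depth rearrangement followed by a non-strided convolution), the rearrangement step can be illustrated in plain Python. The sketch below covers only that step on a single-channel map; the function name `space_to_depth`, the channel ordering, and the toy input are illustrative choices, not details taken from the paper.

```python
from typing import List

FeatureMap = List[List[int]]


def space_to_depth(feature: FeatureMap, scale: int = 2) -> List[FeatureMap]:
    """Split an H x W map into scale*scale sub-maps of size (H/scale) x (W/scale).

    Every input value ends up in exactly one sub-map, so resolution is reduced
    without discarding pixels, unlike strided convolution or pooling.
    """
    height, width = len(feature), len(feature[0])
    if height % scale or width % scale:
        raise ValueError("height and width must be divisible by scale")
    channels = []
    for dy in range(scale):          # row offset inside each scale x scale cell
        for dx in range(scale):      # column offset inside each cell
            channels.append([row[dx::scale] for row in feature[dy::scale]])
    return channels


# Toy 4x4 map holding the values 0..15 in row-major order.
feature = [[4 * r + c for c in range(4)] for r in range(4)]
channels = space_to_depth(feature)
```

For the 4x4 input, the result is four 2x2 sub-maps that together contain all sixteen original values. Preserving every pixel in this way is the usual motivation for using the operation on very small targets, where a strided downsampling step could drop the few pixels that carry the object.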

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12986653/full.md

## Figures

18 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12986653/full.md

## References

46 references — full list in the complete paper: https://tomesphere.com/paper/PMC12986653/full.md

---
Source: https://tomesphere.com/paper/PMC12986653