# RTUAV-YOLO: A Family of Efficient and Lightweight Models for Real-Time Object Detection in UAV Aerial Imagery

**Authors:** Ruizhi Zhang, Jinghua Hou, Le Li, Ke Zhang, Li Zhao, Shuo Gao

PMC · DOI: 10.3390/s25216573 · Sensors (Basel, Switzerland) · 2025-10-25

## TL;DR

RTUAV-YOLO is a new family of lightweight models designed for efficient and accurate real-time object detection in UAV aerial imagery, especially for small objects.

## Contribution

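The paper introduces RTUAV-YOLO, a family of lightweight YOLOv11-based models for UAV object detection built on four components: the MSFAM feature modulation module, the PDSCM context aggregation module, the LDSM downsampling module, and the MPDWIoU loss function.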

## Key findings

- On VisDrone2019, RTUAV-YOLO improves mAP50 and mAP50-95 by an average of 3.4% and 2.4%, respectively, over the baseline model, while reducing the number of parameters by 65.3%.
- Its generalization capability is further validated on the UAVDT and UAVVaste datasets.
- RTUAV-YOLO is successfully deployed on Jetson Orin Nano for real-time UAV object detection.

## Abstract

Real-time object detection in Unmanned Aerial Vehicle (UAV) imagery is critical yet challenging: it requires high accuracy in complex scenes containing multi-scale and small objects, under stringent onboard computational constraints. Because existing methods struggle to balance accuracy and efficiency, we propose RTUAV-YOLO, a family of lightweight models based on YOLOv11 and tailored for real-time UAV object detection. First, to mitigate the feature imbalance and progressive information degradation that small objects suffer in the multi-scale processing of current architectures, we developed a Multi-Scale Feature Adaptive Modulation module (MSFAM) that enhances small-target feature extraction through adaptive weight generation and dual-pathway heterogeneous feature aggregation. Second, to overcome the limited contextual information that current architectures capture in complex scenes, we propose a Progressive Dilated Separable Convolution Module (PDSCM) that aggregates multi-scale target context through continuous receptive field expansion. Third, to preserve the fine-grained spatial information of small objects during feature map downsampling, we engineered a Lightweight DownSampling Module (LDSM) to replace the traditional convolutional module. Finally, to rectify the insensitivity of current Intersection over Union (IoU) metrics toward small objects, we introduce the Minimum Point Distance Wise IoU (MPDWIoU) loss function, which improves small-target localization precision by integrating distance-aware penalty terms and adaptive weighting mechanisms. Comprehensive experiments on the VisDrone2019 dataset show that RTUAV-YOLO achieves an average improvement of 3.4% and 2.4% in mAP50 and mAP50-95, respectively, compared to the baseline model, while reducing the number of parameters by 65.3%. Its generalization capability for UAV object detection is further validated on the UAVDT and UAVVaste datasets. The proposed model is deployed on a typical airborne platform, the Jetson Orin Nano, providing an effective solution for real-time object detection on actual UAVs.
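The abstract describes PDSCM as expanding the receptive field progressively with dilated separable convolutions. The sketch below illustrates only the general arithmetic behind that idea, not the authors' module. The 3x3 kernel, the dilation schedule `(1, 2, 3)`, and the 64-channel width are assumptions chosen for illustration, and none of them is stated in this summary.

```python
# Illustrative arithmetic only: kernel size, dilation schedule, and channel
# counts are assumed values, not taken from the RTUAV-YOLO paper.

def stacked_receptive_field(kernel_size, dilations):
    """Receptive field along one axis of stride-1 convolutions applied in sequence."""
    rf = 1
    for d in dilations:
        # A dilated kernel spans (kernel_size - 1) * d + 1 input positions.
        rf += (kernel_size - 1) * d
    return rf

def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out

def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out

ASSUMED_DILATIONS = (1, 2, 3)  # hypothetical progressive schedule

# Receptive field after each successive dilated 3x3 layer.
rf_progress = [
    stacked_receptive_field(3, ASSUMED_DILATIONS[:i])
    for i in range(1, len(ASSUMED_DILATIONS) + 1)
]

std = standard_conv_params(64, 64, 3)
sep = depthwise_separable_params(64, 64, 3)

print("receptive field per stage:", rf_progress)
print("standard 3x3 weights:", std)
print("depthwise separable weights:", sep)
```

Under these assumptions, each added stage widens the context seen by a pixel without any downsampling, and the separable form needs far fewer weights than a standard convolution of the same kernel size. That combination is why such blocks are common in lightweight detectors.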
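The abstract motivates MPDWIoU by the insensitivity of plain IoU toward small objects, but this summary does not give its formula. As a rough illustration of what a distance-aware penalty term does, the sketch below implements a corner-distance penalty in the style of the earlier MPDIoU loss. It is a stand-in under stated assumptions, not the authors' MPDWIoU, and it omits the adaptive weighting entirely. Boxes are assumed to be `(x1, y1, x2, y2)` tuples in pixels, and the function names are hypothetical.

```python
# Stand-in sketch: an MPDIoU-style corner-distance penalty, NOT the paper's
# MPDWIoU loss (whose exact form and adaptive weighting are not given here).

def iou(box_a, box_b):
    """Plain IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def mpdiou_style_loss(pred, target, img_w, img_h):
    """1 - (IoU - normalized squared distances between matching corners)."""
    # Squared distance between the top-left corners.
    d1_sq = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2
    # Squared distance between the bottom-right corners.
    d2_sq = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2
    # Normalize by the squared image diagonal so the penalty is scale-free.
    norm = img_w ** 2 + img_h ** 2
    return 1.0 - (iou(pred, target) - d1_sq / norm - d2_sq / norm)

target = (20.0, 20.0, 30.0, 30.0)    # a small 10 x 10 px object
near_miss = (0.0, 0.0, 10.0, 10.0)   # no overlap, close to the target
far_miss = (60.0, 60.0, 70.0, 70.0)  # no overlap, farther away

loss_same = mpdiou_style_loss(target, target, 640, 640)
loss_near = mpdiou_style_loss(near_miss, target, 640, 640)
loss_far = mpdiou_style_loss(far_miss, target, 640, 640)
print(loss_same, loss_near, loss_far)
```

For small objects a prediction can miss by a few pixels and have zero overlap, in which case a plain `1 - IoU` loss is 1.0 for both misses and cannot rank them. With the corner-distance term, the nearer miss gets the lower loss, so the loss still distinguishes predictions that do not overlap the target.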

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12608591/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12608591/full.md

## References

57 references — full list in the complete paper: https://tomesphere.com/paper/PMC12608591/full.md

---
Source: https://tomesphere.com/paper/PMC12608591