Real-Time Human Detection for Aerial Captured Video Sequences via Deep Models

Nouar AlDahoul; Aznul Qalid Md Sabri; Ali Mohammed Mansoor

arXiv:2601.00391·cs.LG·January 6, 2026

TL;DR

This paper evaluates deep learning models for real-time human detection in aerial videos and compares their accuracy and speed on a challenging dataset.

Contribution

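The paper combines optical flow with three deep models, namely a supervised CNN (S-CNN), a pretrained CNN feature extractor, and a hierarchical extreme learning machine (H-ELM), for aerial human detection, and compares their effectiveness and speed on a challenging dataset.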

Findings

01. Pretrained CNN achieved 98.09% accuracy.

02. S-CNN achieved 95.6% accuracy with softmax.

03. H-ELM trained in 445 seconds on CPU.
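
The short CPU training time reported for H-ELM is consistent with how extreme learning machines are usually trained: the hidden-layer weights are drawn at random and left fixed, and only the output weights are fitted, in closed form. The following is a minimal sketch of a basic single-hidden-layer ELM on toy data, not the hierarchical variant evaluated in the paper. The function names (`train_elm`, `predict_elm`), the hyperparameters, and the synthetic two-class data are all illustrative assumptions.

```python
import numpy as np

def train_elm(X, y, n_hidden=64, ridge=1e-3, seed=0):
    """Fit a basic ELM classifier (hypothetical helper, not the paper's H-ELM)."""
    rng = np.random.default_rng(seed)
    n_classes = int(y.max()) + 1
    # Random input weights and biases: drawn once, never updated.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)            # hidden-layer activations
    T = np.eye(n_classes)[y]          # one-hot targets
    # Output weights from a ridge-regularized least-squares solve.
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ T)
    return W, b, beta

def predict_elm(model, X):
    """Return the predicted class index for each row of X."""
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)

# Toy stand-in for "human" vs. "non-human" feature vectors (assumed data).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 1.0, size=(100, 2)),
               rng.normal(2.0, 1.0, size=(100, 2))])
y = np.array([0] * 100 + [1] * 100)

model = train_elm(X, y)
accuracy = float(np.mean(predict_elm(model, X) == y))
print(f"training accuracy on toy data: {accuracy:.3f}")
```

Because training reduces to one linear solve, no iterative backpropagation is needed, which is the usual reason ELM-style models train quickly without a GPU.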

Abstract

Human detection in videos plays an important role in various real-life applications. Most traditional approaches depend on handcrafted features, which are problem-dependent and optimal only for specific tasks. Moreover, they are highly susceptible to dynamic events such as illumination changes, camera jitter, and variations in object sizes. On the other hand, the proposed feature learning approaches are cheaper and easier because highly abstract and discriminative features can be produced automatically without the need for expert knowledge. In this paper, we utilize automatic feature learning methods, which combine optical flow and three different deep models (i.e., supervised convolutional neural network (S-CNN), pretrained CNN feature extractor, and hierarchical extreme learning machine) for human detection in videos captured using a nonstatic camera on an aerial platform with…
