FuTH-Net: Fusing Temporal Relations and Holistic Features for Aerial Video Classification
Pu Jin, Lichao Mou, Yuansheng Hua, Gui-Song Xia, Xiao Xiang Zhu

TL;DR
FuTH-Net is a deep neural network that effectively combines holistic features and multi-scale temporal relations to improve aerial video classification, capturing long-term dependencies often missed by previous methods.
Contribution
The paper introduces FuTH-Net, a novel two-pathway architecture with a fusion module that models both holistic features and multi-scale temporal relations for aerial video analysis.
Findings
Achieves state-of-the-art results on ERA and Drone-Action datasets.
Effectively captures long-term temporal dependencies.
Demonstrates strong generalization across different recognition tasks.
Abstract
Unmanned aerial vehicles (UAVs) are now widely applied to data acquisition due to their low cost and fast mobility. With the increasing volume of aerial videos, the demand for automatically parsing these videos is surging. To achieve this, current research mainly focuses on extracting a holistic feature with convolutions along both spatial and temporal dimensions. However, these methods are limited by small temporal receptive fields and cannot adequately capture long-term temporal dependencies, which are important for describing complicated dynamics. In this paper, we propose a novel deep neural network, termed FuTH-Net, to model not only holistic features, but also temporal relations for aerial video classification. Furthermore, the holistic features are refined by the multi-scale temporal relations in a novel fusion module for yielding more discriminative video representations.
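The two-pathway idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the averaging used as a stand-in for the holistic pathway, the frame-difference relation function, and the gating formula in `fuse` are all assumptions chosen to keep the example dependency-free. It only shows the overall data flow: per-frame features feed a holistic summary and a set of relations computed over ordered frame tuples at several scales, and the relations then refine the holistic feature.

```python
from itertools import combinations

def holistic_feature(frames):
    """Stand-in for the holistic pathway (assumption): average features over time."""
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]

def temporal_relation(frames, scale):
    """Toy relation at one scale (assumption): for every ordered tuple of
    `scale` frames, take the mean absolute change between consecutive
    frames in the tuple, then average over all tuples."""
    dim = len(frames[0])
    total = [0.0] * dim
    # combinations() yields index tuples in increasing order,
    # so temporal order inside each tuple is preserved.
    tuples = list(combinations(range(len(frames)), scale))
    for idx in tuples:
        for a, b in zip(idx, idx[1:]):
            for d in range(dim):
                total[d] += abs(frames[b][d] - frames[a][d]) / (scale - 1)
    return [t / len(tuples) for t in total]

def multi_scale_relations(frames, scales):
    """Sum the toy relations over several tuple sizes (scales >= 2)."""
    rel = [0.0] * len(frames[0])
    for k in scales:
        rel = [x + y for x, y in zip(rel, temporal_relation(frames, k))]
    return rel

def fuse(holistic, relations):
    """Hypothetical fusion (assumption): a residual gate in [0, 1) derived
    from the relations rescales each holistic feature dimension."""
    return [h * (1.0 + r / (1.0 + r)) for h, r in zip(holistic, relations)]

# Usage: four frames with a 1-D feature that grows over time.
frames = [[0.0], [1.0], [2.0], [3.0]]
video_repr = fuse(holistic_feature(frames), multi_scale_relations(frames, [2, 3]))
```

In this sketch a video whose frames never change yields zero relations, so the fused representation equals the holistic feature; any temporal change alters it. A real model would replace each toy function with learned layers.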
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Video Surveillance and Tracking Methods
