FAR: Fourier Aerial Video Recognition
Divya Kothandaraman, Tianrui Guan, Xijun Wang, Sean Hu, Ming Lin, Dinesh Manocha

TL;DR
FAR introduces a Fourier-based method for UAV video activity recognition that effectively disentangles human actions from backgrounds and models spatial-temporal dependencies with reduced computation, achieving significant accuracy improvements.
Contribution
The paper proposes a novel Fourier object disentanglement and Fourier Attention mechanism for UAV video recognition, enhancing accuracy and efficiency over existing methods.
Findings
Achieves 8.02% to 38.69% higher top-1 accuracy than prior methods on UAV datasets.
Operates up to 3 times faster than prior methods.
Effectively separates human actions from backgrounds in aerial videos.
Abstract
We present an algorithm, Fourier Activity Recognition (FAR), for UAV video activity recognition. Our formulation uses a novel Fourier object disentanglement method to innately separate out the human agent (which is typically small) from the background. Our disentanglement technique operates in the frequency domain to characterize the extent of temporal change of spatial pixels, and exploits convolution-multiplication properties of Fourier transform to map this representation to the corresponding object-background entangled features obtained from the network. To encapsulate contextual information and long-range space-time dependencies, we present a novel Fourier Attention algorithm, which emulates the benefits of self-attention by modeling the weighted outer product in the frequency domain. Our Fourier attention formulation uses much fewer computations than self-attention. We have…
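The disentanglement idea in the abstract (scoring how much each spatial pixel changes over time in the frequency domain, then mapping that score onto network features) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names `temporal_change_mask` and `disentangle`, the choice to sum the non-DC magnitudes, the max-normalization, and the toy data are all assumptions made for illustration only.

```python
import numpy as np


def temporal_change_mask(video):
    """Score per-pixel temporal change (hypothetical helper, not from the paper).

    video: array of shape (T, H, W), grayscale frames.
    Returns an (H, W) map scaled to [0, 1].
    """
    # 1-D FFT along the time axis, computed independently for every pixel.
    spectrum = np.fft.fft(video, axis=0)
    magnitude = np.abs(spectrum)
    # Index 0 is the DC term (the static part of each pixel); the remaining
    # bins carry the temporal variation, so sum only those.
    dynamic_energy = magnitude[1:].sum(axis=0)
    peak = dynamic_energy.max()
    # Assumed normalization: divide by the largest response, if any.
    return dynamic_energy / peak if peak > 0 else dynamic_energy


def disentangle(features, mask):
    """Reweight feature maps by the temporal-change mask (assumed design).

    features: array of shape (C, H, W); mask: array of shape (H, W).
    """
    return features * mask[None, :, :]


# Toy example: a static background with a single flickering pixel.
T, H, W = 8, 6, 6
video = np.full((T, H, W), 0.5)
video[:, 2, 3] = np.arange(T) % 2  # the only pixel that changes over time

mask = temporal_change_mask(video)
features = np.ones((4, H, W))  # stand-in for entangled network features
moving_features = disentangle(features, mask)
```

In this toy case the mask is close to 1 at the flickering pixel and close to 0 on the static background, so the reweighted features keep only the moving region. The actual method additionally relies on the convolution-multiplication property of the Fourier transform and on a learned network, neither of which is modelled here.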
Taxonomy
Topics: Human Pose and Action Recognition · Anomaly Detection Techniques and Applications · Domain Adaptation and Few-Shot Learning
