DroneAttention: Sparse Weighted Temporal Attention for Drone-Camera Based Activity Recognition
Santosh Kumar Yadav, Achleshwar Luthra, Esha Pahwa, Kamlesh Tiwari, Heena Rathore, Hari Mohan Pandey, Peter Corcoran

TL;DR
This paper introduces a novel Sparse Weighted Temporal Attention (SWTA) module for drone-based human activity recognition, improving accuracy by effectively utilizing sparsely sampled frames and combining optical flow with RGB data.
Contribution
The paper presents a new SWTA module that enhances existing CNN architectures for activity recognition by capturing temporal information without needing separate temporal streams.
Findings
Achieved state-of-the-art accuracy on three benchmark datasets.
Surpassed previous methods by significant margins on all datasets.
Demonstrated the effectiveness of sparse sampling combined with weighted attention.
Abstract
Human activity recognition (HAR) using drone-mounted cameras has attracted considerable interest from the computer vision research community in recent years. A robust and efficient HAR system plays a pivotal role in fields such as video surveillance, crowd behavior analysis, sports analysis, and human-computer interaction. The task is challenging because of complex poses, varying viewpoints, and the environmental scenarios in which the action takes place. To address these complexities, we propose a novel Sparse Weighted Temporal Attention (SWTA) module that uses sparsely sampled video frames to obtain global weighted temporal attention. The proposed SWTA comprises two parts: first, a temporal segment network that sparsely samples a given set of frames; second, weighted temporal attention, which incorporates a fusion of attention maps derived from…
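The sparse sampling step named in the abstract can be illustrated with a small sketch. It assumes the usual temporal-segment-style scheme, in which a video is split into equal-length segments and one frame is drawn from each; the segment count, the random draw, and the function name `sparse_sample_indices` are illustrative assumptions and are not taken from the paper. The weighted attention step is not sketched, since this summary truncates its description.

```python
import random


def sparse_sample_indices(num_frames, num_segments, rng=None):
    """Return one frame index per segment (hypothetical helper, not the authors' code).

    The video's frame range [0, num_frames) is split into `num_segments`
    contiguous, near-equal segments, and one index is drawn uniformly at
    random from each segment.
    """
    if num_segments <= 0:
        raise ValueError("num_segments must be positive")
    if num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    rng = rng or random.Random()

    indices = []
    for k in range(num_segments):
        # Integer segment boundaries; `end` is exclusive.
        start = k * num_frames // num_segments
        end = (k + 1) * num_frames // num_segments
        indices.append(rng.randrange(start, end))
    return indices


if __name__ == "__main__":
    # Example: a 300-frame clip reduced to 8 sparsely sampled frames.
    print(sparse_sample_indices(300, 8, random.Random(0)))
```

Because each index comes from a different segment, the sampled frames span the whole clip while the number of frames passed to the network stays fixed regardless of video length.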
Taxonomy
Topics: Video Surveillance and Tracking Methods · Human Pose and Action Recognition · Anomaly Detection Techniques and Applications
