TL;DR
This paper introduces a two-stage segmentation-based method with spatio-temporal attention for detecting small, fast-moving drones in videos, addressing challenges like occlusion and motion complexity.
Contribution
It proposes a novel approach combining pyramid pooling, attention mechanisms, and 3D convolution features for improved drone detection in challenging scenarios.
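As a rough illustration of one ingredient named above, the sketch below shows generic pyramid pooling on a single 2D feature map: the map is average-pooled into 1x1, 2x2, and 4x4 grids and the results are concatenated, so the output mixes global and local context. This is not the paper's module. The function name `pyramid_pool`, the `levels` parameter, and the use of plain Python lists are assumptions made for illustration only.

```python
def pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Average-pool a 2D feature map into an n x n grid for each n in levels.

    Hypothetical helper for illustration; not taken from the paper.
    Returns one flat list holding the bin averages of every level in order.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    if height < max(levels) or width < max(levels):
        raise ValueError("feature map is smaller than the finest pooling grid")

    pooled = []
    for n in levels:
        for i in range(n):
            # Integer bin edges that cover every row exactly once.
            r0, r1 = (i * height) // n, ((i + 1) * height) // n
            for j in range(n):
                c0, c1 = (j * width) // n, ((j + 1) * width) // n
                cells = [feature_map[r][c]
                         for r in range(r0, r1)
                         for c in range(c0, c1)]
                pooled.append(sum(cells) / len(cells))
    return pooled


# A 4x4 map with values 0..15, pooled at two levels.
fmap = [[r * 4 + c for c in range(4)] for r in range(4)]
print(pyramid_pool(fmap, levels=(1, 2)))
```

The coarse level summarizes the whole frame, while finer levels keep some spatial layout, which is one common reason to use this kind of pooling when the target occupies only a few pixels.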
Findings
Outperforms several baseline methods on public datasets.
Effectively captures contextual and motion information for small object detection.
Demonstrates robustness against occlusion and complex drone movements.
Abstract
As airborne vehicles are becoming more autonomous and ubiquitous, it has become vital to develop the capability to detect the objects in their surroundings. This paper attempts to address the problem of detecting drones from other flying drones. The erratic movement of the source and target drones, small size, arbitrary shape, large intensity variations, and occlusion make this problem quite challenging. In this scenario, region-proposal-based methods are not able to capture sufficient discriminative foreground-background information. Also, due to the extremely small size and complex motion of the source and target drones, feature-aggregation-based methods are unable to perform well. To handle this, instead of using region-proposal-based methods, we propose to use a two-stage segmentation-based approach employing spatio-temporal attention cues. During the first stage, given the…
Taxonomy
Methods: 3D Convolution · Convolution
