DroneKey: Drone 3D Pose Estimation in Image Sequences using Gated Key-representation and Pose-adaptive Learning
Seo-Bin Hwang, Yeong-Jun Cho

TL;DR
DroneKey combines gated key-representation extraction with pose-adaptive learning to estimate 3D drone pose from image sequences accurately and in real time. It targets the two main difficulties of drone keypoint detection: propellers that look nearly identical and a wide diversity of poses.
Contribution
The paper presents DroneKey, a new method with a gated key-representation and pose-adaptive loss for improved drone keypoint detection and 3D pose estimation.
Findings
Keypoint detection: 99.68% AP
Speed: real-time processing at 44 FPS
3D pose estimation: MAE-angle of 10.62°
Abstract
Estimating the 3D pose of a drone is important for anti-drone systems, but existing methods struggle with the unique challenges of drone keypoint detection. Drone propellers serve as keypoints but are difficult to detect due to their high visual similarity and diversity of poses. To address these challenges, we propose DroneKey, a framework that combines a 2D keypoint detector and a 3D pose estimator specifically designed for drones. In the keypoint detection stage, we extract two key-representations (intermediate and compact) from each transformer encoder layer and optimally combine them using a gated sum. We also introduce a pose-adaptive Mahalanobis distance in the loss function to ensure stable keypoint predictions across extreme poses. We built new datasets of drone 2D keypoints and 3D pose to train and evaluate our method, which have been publicly released. Experiments show that…
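The two mechanisms named in the abstract, the gated sum and the Mahalanobis-distance loss, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract gives neither the gating formula nor how the covariance is adapted to pose, so the sigmoid gate, the per-layer scalar gate logits, the final sum over layers, and the externally supplied covariance matrix are all assumptions made for illustration, as are the function names.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, used here as the gate activation (an assumption)."""
    return 1.0 / (1.0 + np.exp(-x))


def gated_sum(intermediate, compact, gate_logits):
    """Blend two key-representations per encoder layer, then sum over layers.

    intermediate, compact: arrays of shape (num_layers, dim)
    gate_logits: array of shape (num_layers,), hypothetical learnable scalars
    """
    gate = sigmoid(gate_logits)[:, None]          # (num_layers, 1), in (0, 1)
    per_layer = gate * intermediate + (1.0 - gate) * compact
    return per_layer.sum(axis=0)                  # (dim,)


def mahalanobis_loss(pred, target, cov):
    """Mean Mahalanobis distance between predicted and target 2D keypoints.

    pred, target: arrays of shape (num_keypoints, 2)
    cov: (2, 2) symmetric positive-definite matrix; how it is derived from
         the drone pose is not stated in the abstract, so it is passed in.
    """
    diff = pred - target
    cov_inv = np.linalg.inv(cov)
    # Squared distance per keypoint: diff_k^T * cov_inv * diff_k
    sq_dist = np.einsum("ki,ij,kj->k", diff, cov_inv, diff)
    return float(np.sqrt(sq_dist).mean())
```

With an identity covariance the loss reduces to the mean Euclidean distance. A covariance stretched along one axis down-weights errors along that axis, which is one way a loss could tolerate the keypoint spread seen in extreme poses.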
