PiLoT: Neural Pixel-to-3D Registration for UAV-based Ego and Target Geo-localization
Xiaoya Cheng, Long Wang, Yan Liu, Xinyi Liu, Hanlin Tan, Yu Liu, Maojun Zhang, Shen Yan

TL;DR
PiLoT is a real-time UAV localization framework that directly registers live video against a geo-referenced 3D map, outperforming traditional methods, especially in GNSS-denied environments.
Contribution
The paper introduces a unified neural framework with a dual-thread engine, a synthetic dataset, and a neural-guided optimizer for robust UAV geo-localization.
Findings
PiLoT achieves state-of-the-art accuracy on multiple benchmarks.
Runs at over 25 FPS on NVIDIA Jetson Orin.
Generalizes from synthetic to real data in a zero-shot manner.
Abstract
We present PiLoT, a unified framework that tackles UAV-based ego and target geo-localization. Conventional approaches rely on decoupled pipelines that fuse GNSS and Visual-Inertial Odometry (VIO) for ego-pose estimation, and on active sensors such as laser rangefinders for target localization. However, these methods are susceptible to failure in GNSS-denied environments and incur substantial hardware cost and complexity. PiLoT breaks this paradigm by directly registering a live video stream against a geo-referenced 3D map. To achieve robust, accurate, and real-time performance, we introduce three key contributions: 1) a Dual-Thread Engine that decouples map rendering from the core localization thread, ensuring low latency while maintaining drift-free accuracy; 2) a large-scale synthetic dataset with precise geometric annotations (camera pose, depth maps). This dataset enables the training of…
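The decoupling of map rendering from localization described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `Reference`, `SharedState`, `render_loop`, `register`, and `localize` are hypothetical, the pose is reduced to a single float, and `register` is a placeholder for the neural pixel-to-3D registration step. The sketch only shows the threading pattern, in which the localization loop always uses the newest rendered view available and never waits for a render to finish.

```python
import threading
import time
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Reference:
    """Hypothetical rendered view of the 3D map, tagged with its render pose."""
    version: int
    pose: float  # 1-D stand-in for a full 6-DoF camera pose (assumption)


class SharedState:
    """State exchanged between the two threads, guarded by a single lock."""

    def __init__(self, initial_pose: float) -> None:
        self.lock = threading.Lock()
        self.pose = initial_pose                     # latest pose estimate
        self.reference = Reference(0, initial_pose)  # latest rendered view


def render_loop(state: SharedState, stop: threading.Event, render_time: float) -> None:
    """Rendering thread: keep re-rendering the map near the latest pose estimate."""
    version = 0
    while not stop.is_set():
        with state.lock:
            pose = state.pose
        time.sleep(render_time)  # placeholder for slow map rendering
        version += 1
        with state.lock:
            state.reference = Reference(version, pose)


def register(frame: float, reference: Reference) -> float:
    """Placeholder for registration; in this toy the frame already is the pose."""
    return frame


def localize(frames: List[float], render_time: float = 0.005,
             frame_time: float = 0.001) -> List[Tuple[float, int]]:
    """Localization loop (on the caller's thread): never waits for a render."""
    state = SharedState(initial_pose=frames[0])
    stop = threading.Event()
    renderer = threading.Thread(
        target=render_loop, args=(state, stop, render_time), daemon=True
    )
    renderer.start()
    results: List[Tuple[float, int]] = []
    try:
        for frame in frames:
            with state.lock:
                reference = state.reference  # grab whichever view is newest
            estimate = register(frame, reference)
            with state.lock:
                state.pose = estimate        # hint for the next render
            results.append((estimate, reference.version))
            time.sleep(frame_time)           # placeholder for the frame interval
    finally:
        stop.set()
        renderer.join()
    return results
```

Calling `localize([float(i) for i in range(10)])` returns one `(estimate, reference_version)` pair per frame. The reference versions depend on thread timing, so several consecutive frames may reuse the same rendered view; that reuse is the point of the pattern, since the per-frame latency is then bounded by registration rather than by rendering.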
