HE-VPR: Height Estimation Enabled Aerial Visual Place Recognition Against Scale Variance
Mengfan He, Xingyu Shao, Chunyu Li, Chao Chen, Liangzheng Sun, Ziyang Meng, Yuanqing Wu

TL;DR
HE-VPR is a height-aware visual place recognition framework that improves accuracy and efficiency by decoupling height estimation from place recognition, using two lightweight adapter branches on a shared frozen DINOv2 backbone and a center-weighted masking strategy.
Contribution
The paper presents a novel height estimation enabled VPR system that reduces training costs and search space, enhancing robustness against scale differences in aerial imagery.
Findings
Up to 6.1% improvement in Recall@1 over state-of-the-art ViT-based baselines.
Memory usage reduced by up to 90%.
Evaluated on two self-collected, challenging multi-altitude datasets.
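For readers unfamiliar with the metric, Recall@1 is commonly computed as the fraction of queries whose top-ranked retrieval is a correct match. The sketch below illustrates that general definition only; the function name, the toy indices, and the rule for what counts as a correct match are assumptions, since this summary does not state the paper's ground-truth criterion.

```python
def recall_at_1(top1_indices, positives):
    """Fraction of queries whose top-1 retrieved index is a true match.

    top1_indices: one retrieved database index per query.
    positives: one set of acceptable database indices per query
               (hypothetical ground truth, e.g. references near the query).
    """
    hits = sum(1 for pred, pos in zip(top1_indices, positives) if pred in pos)
    return hits / len(top1_indices)


# Toy example: 3 of the 4 queries retrieve an acceptable reference.
top1 = [4, 7, 2, 9]
positives = [{4, 5}, {1}, {2}, {9, 10}]
score = recall_at_1(top1, positives)
print(score)  # 0.75
```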
Abstract
In this work, we propose HE-VPR, a visual place recognition (VPR) framework that incorporates height estimation. Our system decouples height inference from place recognition, allowing both modules to share a frozen DINOv2 backbone. Two lightweight bypass adapter branches are integrated into our system. The first estimates the height partition of the query image via retrieval from a compact height database, and the second performs VPR within the corresponding height-specific sub-database. The adaptation design reduces training cost and significantly decreases the search space of the database. We also adopt a center-weighted masking strategy to further enhance the robustness against scale differences. Experiments on two self-collected challenging multi-altitude datasets demonstrate that HE-VPR achieves up to 6.1% Recall@1 improvement over state-of-the-art ViT-based baselines and reduces…
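The two-stage lookup described in the abstract can be pictured roughly as follows. This is a minimal sketch, not the authors' implementation: it assumes the two branches have already produced L2-normalised descriptors, uses random NumPy vectors in place of DINOv2 features, and every name (`two_stage_retrieve`, `height_db`, `place_dbs`, and so on) is hypothetical.

```python
import numpy as np


def l2_normalize(x):
    # Scale each row to unit length so a dot product equals cosine similarity.
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def two_stage_retrieve(q_height, q_place, height_db, height_labels, place_dbs):
    """Return (predicted height partition, best match index in that partition)."""
    # Stage 1: nearest neighbour in the compact height database.
    height_scores = height_db @ q_height
    partition = int(height_labels[np.argmax(height_scores)])
    # Stage 2: search only the sub-database of the predicted partition,
    # which is what shrinks the search space.
    place_scores = place_dbs[partition] @ q_place
    return partition, int(np.argmax(place_scores))


rng = np.random.default_rng(0)
dim = 8  # toy descriptor size (assumption)

# Toy height database: 6 descriptors labelled with 3 height partitions.
height_db = l2_normalize(rng.normal(size=(6, dim)))
height_labels = np.array([0, 0, 1, 1, 2, 2])

# Toy place sub-databases: 5 reference descriptors per height partition.
place_dbs = {p: l2_normalize(rng.normal(size=(5, dim))) for p in range(3)}

# Build a query from known entries so the expected result is unambiguous.
q_height = height_db[3]    # belongs to partition 1
q_place = place_dbs[1][2]  # reference 2 inside partition 1

partition, match = two_stage_retrieve(
    q_height, q_place, height_db, height_labels, place_dbs
)
print(partition, match)  # 1 2
```

In this sketch the second stage compares the query against 5 references instead of all 15, which mirrors the search-space reduction the abstract describes.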
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques · Advanced Neural Network Applications
