Range and Bird's Eye View Fused Cross-Modal Visual Place Recognition
Jianyi Peng, Fan Lu, Bin Li, Yuan Huang, Sanqing Qu, Guang Chen

TL;DR
This paper presents a cross-modal visual place recognition method that combines range and Bird's Eye View (BEV) images, improving accuracy through an initial retrieval + re-rank approach and a specialized similarity supervision technique.
Contribution
It introduces an efficient retrieval + re-rank framework that fuses range and BEV images and employs a novel similarity supervision method to enhance training with limited data.
Findings
Outperforms state-of-the-art methods on the KITTI dataset
Effective fusion of range and BEV images improves recognition accuracy
Novel similarity supervision enhances training efficiency
Abstract
Image-to-point cloud cross-modal Visual Place Recognition (VPR) is a challenging task where the query is an RGB image, and the database samples are LiDAR point clouds. Compared to single-modal VPR, this approach benefits from the widespread availability of RGB cameras and the robustness of point clouds in providing accurate spatial geometry and distance information. However, current methods rely on intermediate modalities that capture either the vertical or horizontal field of view, limiting their ability to fully exploit the complementary information from both sensors. In this work, we propose an innovative initial retrieval + re-rank method that effectively combines information from range (or RGB) images and Bird's Eye View (BEV) images. Our approach relies solely on a computationally efficient global descriptor similarity search process to achieve re-ranking. Additionally, we…
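The retrieve-then-re-rank idea described in the abstract can be illustrated with a minimal sketch in which both stages use only global descriptor similarity. Everything concrete here is an assumption made for illustration, not the authors' implementation: the function names, the use of cosine similarity, the shortlist size `k`, and the choice to run the first stage on range-image descriptors and the second on BEV descriptors. The descriptors themselves are taken as given; how they are computed is outside this sketch.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors along the last axis to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def retrieve_then_rerank(q_range, q_bev, db_range, db_bev, k=5):
    """Hypothetical two-stage search over global descriptors.

    q_range, q_bev   : 1-D query descriptors for the two views (assumed given)
    db_range, db_bev : 2-D arrays, one database descriptor per row
    Returns database indices of the top-k candidates, re-ranked.
    """
    # Stage 1 (assumed): cosine similarity on range-view descriptors,
    # keeping the k most similar database entries as a shortlist.
    sims = l2_normalize(db_range) @ l2_normalize(q_range)
    k = min(k, sims.shape[0])
    candidates = np.argsort(-sims)[:k]

    # Stage 2 (assumed): re-order only the shortlist by cosine similarity
    # on BEV descriptors. No local feature matching is involved.
    bev_sims = l2_normalize(db_bev[candidates]) @ l2_normalize(q_bev)
    return candidates[np.argsort(-bev_sims)]


# Toy data: three database places with 2-D descriptors per view.
db_range = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
db_bev = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 0.0]])
q_range = np.array([1.0, 0.0])
q_bev = np.array([1.0, 0.0])

ranked = retrieve_then_rerank(q_range, q_bev, db_range, db_bev, k=2)
print(ranked.tolist())  # [1, 0]
```

In the toy data, the first stage shortlists places 0 and 1, and the second stage promotes place 1 because its BEV descriptor matches the query better. Place 2 also matches in the BEV view but never enters the shortlist, which shows the usual trade-off of this design: re-ranking stays cheap because it touches only `k` candidates, but it cannot recover a place the first stage missed.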
Taxonomy
Topics: Robotics and Sensor-Based Localization · Video Surveillance and Tracking Methods · Advanced Image and Video Retrieval Techniques
