BEV$^2$PR: BEV-Enhanced Visual Place Recognition with Structural Cues
Fudong Ge, Yiwei Zhang, Shuhan Shen, Yue Wang, Weiming Hu, Jin Gao

TL;DR
This paper introduces BEV$^2$PR, a visual place recognition framework that exploits structural cues from a bird's-eye view (BEV) generated from a single camera, improving recognition accuracy without additional sensors.
Contribution
The paper presents a new BEV-enhanced VPR framework that combines visual and structural features from a monocular camera, addressing sensor cost and data alignment challenges in existing methods.
Findings
Improves Recall@1 by 2.47% over the baseline
Gains 18.06% on the hard subset of the dataset
Shows consistent performance improvements across multiple modules
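For readers unfamiliar with the metric, Recall@K in place recognition is commonly computed as the fraction of queries for which at least one correct database match appears among the K most similar retrieved entries. The following is a minimal sketch of that common definition, not the paper's evaluation code: the descriptors, the cosine-similarity ranking, and the names `cosine`, `recall_at_k`, and `positives` are all illustrative assumptions. The listing does not say whether the reported gains are absolute points or relative changes, so the sketch makes no claim about that.

```python
import math

def cosine(a, b):
    """Cosine similarity between two descriptor vectors (assumed non-zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def recall_at_k(queries, database, positives, k=1):
    """Fraction of queries with at least one true match in the top-k results.

    queries, database: lists of descriptor vectors (hypothetical toy data).
    positives: for each query, the database indices counted as correct.
    """
    hits = 0
    for query, correct in zip(queries, positives):
        # Rank database entries from most to least similar to the query.
        ranked = sorted(
            range(len(database)),
            key=lambda i: cosine(query, database[i]),
            reverse=True,
        )
        # The query counts as a hit if any top-k index is a true match.
        if set(ranked[:k]) & set(correct):
            hits += 1
    return hits / len(queries)

# Toy example: three database places, three queries.
database = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
queries = [[0.9, 0.1], [0.1, 0.9], [1.0, 0.0]]
positives = [[0], [2], [0]]

r1 = recall_at_k(queries, database, positives, k=1)
r2 = recall_at_k(queries, database, positives, k=2)
```

In this toy data the second query ranks a wrong place first and its true match second, so it counts as a miss at K=1 and a hit at K=2.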
Abstract
In this paper, we propose a new image-based visual place recognition (VPR) framework by exploiting the structural cues in bird's-eye view (BEV) from a single monocular camera. The motivation arises from two key observations about place recognition methods based on both appearance and structure: 1) For methods relying on LiDAR sensors, integrating LiDAR into robotic systems increases cost, and aligning data between different sensors is also a major challenge. 2) Other image-/camera-based methods, which integrate RGB images with their derived variants (e.g., pseudo depth images, pseudo 3D point clouds), exhibit several limitations, such as the failure to effectively exploit the explicit spatial relationships between different objects. To tackle the above issues, we design a new BEV-enhanced VPR framework, namely BEV$^2$PR, generating a composite…
Taxonomy
Topics: Robotics and Sensor-Based Localization · Gaze Tracking and Assistive Technology · Advanced Image and Video Retrieval Techniques
