ImLPR: Image-based LiDAR Place Recognition using Vision Foundation Models
Minwoo Jung, Lanke Frank Tarimo Fu, Maurice Fallon, Ayoung Kim

TL;DR
ImLPR is a LiDAR place recognition pipeline that converts point clouds into three-channel Range Image Views (RIV) so that a pre-trained DINOv2 Vision Foundation Model, adapted with MultiConv adapters, can generate the place descriptors.
Contribution
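To the authors' knowledge, ImLPR is the first method to use a Vision Foundation Model for LiDAR place recognition while retaining the majority of its pre-trained knowledge. It converts raw point clouds into Range Image Views so that pre-trained visual features can be used in the LiDAR domain.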
Findings
ImLPR outperforms state-of-the-art methods on public datasets.
It is effective in both intra-session and inter-session LPR scenarios.
The results demonstrate the benefit of using pre-trained VFMs in the LiDAR domain.
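The difference between intra-session and inter-session retrieval can be illustrated with a small nearest-neighbor search over descriptors. This is a minimal sketch, not the paper's evaluation code: the function names, the cosine-similarity metric, and the `exclude_window` parameter are all assumptions made for illustration. The exclusion window reflects a common convention for intra-session evaluation, where frames adjacent to the query in the same sequence are skipped so that a match counts as a genuine revisit.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two non-zero descriptor vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b)


def retrieve(query_desc, query_index, database, same_session, exclude_window=50):
    """Return the index of the most similar database descriptor, or None.

    same_session=True models intra-session retrieval: the query comes from
    the same sequence as the database, so entries within `exclude_window`
    frames of the query are skipped (hypothetical convention).
    same_session=False models inter-session retrieval: the database was
    recorded in a different session, so every entry is a candidate.
    """
    best_index, best_score = None, -math.inf
    for i, desc in enumerate(database):
        if same_session and abs(i - query_index) <= exclude_window:
            continue
        score = cosine_similarity(query_desc, desc)
        if score > best_score:
            best_index, best_score = i, score
    return best_index
```

In this sketch a retrieved index would still need to be checked against ground-truth poses before it counts as a correct place match.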
Abstract
LiDAR Place Recognition (LPR) is a key component in robotic localization, enabling robots to align current scans with prior maps of their environment. While Visual Place Recognition (VPR) has embraced Vision Foundation Models (VFMs) to enhance descriptor robustness, LPR has relied on task-specific models with limited use of pre-trained foundation-level knowledge. This is due to the lack of 3D foundation models and the challenges of using VFM with LiDAR point clouds. To tackle this, we introduce ImLPR, a novel pipeline that employs a pre-trained DINOv2 VFM to generate rich descriptors for LPR. To the best of our knowledge, ImLPR is the first method to utilize a VFM for LPR while retaining the majority of pre-trained knowledge. ImLPR converts raw point clouds into novel three-channel Range Image Views (RIV) to leverage VFM in the LiDAR domain. It employs MultiConv adapters and…
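The conversion from a point cloud to an image can be sketched with a standard spherical projection. This is a hedged illustration only: the text above does not say which three channels the RIV uses, so the layout here (range, intensity, z-height) is an assumption, as are the function name, the image size, and the vertical field of view.

```python
import math


def point_cloud_to_range_image(points, height=64, width=1024,
                               fov_up_deg=22.5, fov_down_deg=-22.5):
    """Project (x, y, z, intensity) points onto a height x width grid.

    Each pixel holds a 3-tuple. The channel layout [range, intensity, z]
    is an illustrative assumption, not the paper's RIV definition.
    Empty pixels stay at (0.0, 0.0, 0.0).
    """
    fov_up = math.radians(fov_up_deg)
    fov = fov_up - math.radians(fov_down_deg)
    image = [[(0.0, 0.0, 0.0) for _ in range(width)] for _ in range(height)]

    for x, y, z, intensity in points:
        rng = math.sqrt(x * x + y * y + z * z)
        if rng < 1e-6:
            continue  # skip degenerate returns at the sensor origin
        yaw = math.atan2(y, x)        # azimuth in [-pi, pi]
        pitch = math.asin(z / rng)    # elevation angle

        # Azimuth maps to a column (wrapping around), elevation to a row.
        col = int(math.floor(0.5 * (1.0 - yaw / math.pi) * width)) % width
        row = int(math.floor((fov_up - pitch) / fov * height))
        if row < 0 or row >= height:
            continue  # outside the assumed vertical field of view

        # Keep only the nearest return when several points share a pixel.
        current_range = image[row][col][0]
        if current_range == 0.0 or rng < current_range:
            image[row][col] = (rng, intensity, z)
    return image
```

An image produced this way has the same height x width x 3 shape as an RGB image, which is what lets a vision backbone consume it; in practice the channels would also need to be normalized to the value range the backbone expects.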
