Fast 3D point clouds retrieval for Large-scale 3D Place Recognition
Chahine-Nicolas Zede, Laurent Carrafa, Valérie Gouet-Brunet

TL;DR
This paper introduces a novel method for fast large-scale 3D point cloud retrieval by adapting a transformer-based Differentiable Search Index, enabling constant-time retrieval for 3D place recognition tasks.
Contribution
It presents an approach that combines Vision Transformers with the Differentiable Search Index (DSI) to accelerate 3D point cloud retrieval, and it is reported to outperform existing methods in speed and accuracy.
Findings
Achieves constant-time retrieval in large-scale 3D point cloud datasets.
Outperforms state-of-the-art methods in place recognition accuracy.
Demonstrates significant speed improvements without sacrificing quality.
Abstract
Retrieval in 3D point clouds is a challenging task that consists of retrieving the point clouds most similar to a given query within a reference set of 3D points. Current methods focus on comparing descriptors of point clouds in order to identify similar ones. Because this comparison step is costly, we focus on accelerating retrieval by adapting the Differentiable Search Index (DSI), a transformer-based approach initially designed for text information retrieval, to 3D point cloud retrieval. Our approach generates 1D identifiers based on the point descriptors, enabling direct retrieval in constant time. To adapt DSI to 3D data, we integrate Vision Transformers to map descriptors to these identifiers while incorporating positional and semantic encoding. The approach is evaluated for place recognition on a public benchmark, comparing its retrieval capabilities against…
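To make the contrast in the abstract concrete, the sketch below compares descriptor comparison against every reference entry with direct lookup through a fixed-length 1D identifier. It is a toy illustration under stated assumptions, not the paper's method: `predict_identifier` is a hypothetical stand-in that uses signs of fixed random projections where the paper uses a learned transformer, and `DIM`, `ID_BITS`, and the random descriptors are invented for the example.

```python
import random

DIM = 16       # descriptor length (assumed for illustration)
ID_BITS = 32   # identifier length (assumed for illustration)

rng = random.Random(0)

# Hypothetical stand-in for the learned model: fixed random hyperplanes.
# The paper maps descriptors to identifiers with a transformer instead.
PLANES = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(ID_BITS)]


def predict_identifier(descriptor):
    """Map a descriptor to a fixed-length 1D identifier (a bit string)."""
    bits = []
    for plane in PLANES:
        dot = sum(p * d for p, d in zip(plane, descriptor))
        bits.append("1" if dot >= 0.0 else "0")
    return "".join(bits)


def build_index(descriptors):
    """Store each reference entry under its identifier."""
    index = {}
    for place_id, desc in enumerate(descriptors):
        index.setdefault(predict_identifier(desc), []).append(place_id)
    return index


def retrieve_direct(index, query):
    """One identifier prediction plus one hash lookup.

    The cost depends on DIM and ID_BITS, not on the number of
    reference entries.
    """
    return index.get(predict_identifier(query), [])


def retrieve_brute_force(descriptors, query):
    """Baseline: compare the query with every reference descriptor."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(descriptors)),
               key=lambda i: sq_dist(descriptors[i], query))


# Synthetic reference descriptors, for illustration only.
reference = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(200)]
index = build_index(reference)

query = reference[42]
direct_hits = retrieve_direct(index, query)
brute_hit = retrieve_brute_force(reference, query)
```

The random-projection hash only recovers an entry when the query lands in the same bucket, so it is far less tolerant of noisy queries than a trained model would be. The point of the sketch is the cost structure: the baseline grows with the size of the reference set, while the identifier route does a fixed amount of work per query.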
Taxonomy
Topics: Robotics and Sensor-Based Localization · 3D Shape Modeling and Analysis · Advanced Image and Video Retrieval Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Focus
