Fast 3D point clouds retrieval for Large-scale 3D Place Recognition

Chahine-Nicolas Zede; Laurent Carrafa; Valérie Gouet-Brunet

arXiv:2502.21067·cs.CV·May 29, 2025

TL;DR

This paper introduces a method for fast, large-scale 3D point cloud retrieval that adapts the transformer-based Differentiable Search Index (DSI), enabling constant-time retrieval for 3D place recognition.

Contribution

It presents a new approach that leverages Vision Transformers and DSI to significantly accelerate 3D point cloud retrieval, outperforming existing methods in speed and accuracy.

Findings

01

Achieves constant-time retrieval in large-scale 3D point cloud datasets.

02

Outperforms state-of-the-art methods in place recognition accuracy.

03

Demonstrates significant speed improvements without sacrificing quality.

Abstract

Retrieval in 3D point clouds is a challenging task that consists of retrieving the point clouds most similar to a given query from a reference set of 3D points. Current methods compare point cloud descriptors to identify similar ones. Because of the complexity of this comparison step, we focus on accelerating retrieval by adapting the Differentiable Search Index (DSI), a transformer-based approach originally designed for text information retrieval, to 3D point cloud retrieval. Our approach generates 1D identifiers from the point descriptors, enabling direct retrieval in constant time. To adapt DSI to 3D data, we integrate Vision Transformers to map descriptors to these identifiers while incorporating positional and semantic encoding. The approach is evaluated for place recognition on a public benchmark comparing its retrieval capabilities against…
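
As a rough illustration of the idea in the abstract, the Python sketch below contrasts a linear scan over descriptors with a direct lookup through a fixed-length identifier. It is not the authors' model: a random-projection sign hash (`make_identifier`, a hypothetical name) stands in for the trained transformer, and the toy descriptors, their dimension, and the 32-symbol identifier length are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy data: 1000 reference descriptors of dimension 64.
reference = rng.normal(size=(1000, 64))

def linear_scan(query, reference):
    """Baseline: compare the query with every reference descriptor, O(N)."""
    distances = np.linalg.norm(reference - query, axis=1)
    return int(np.argmin(distances))

# Stand-in for the learned descriptor-to-identifier mapping (an assumption,
# not the paper's transformer): 32 random hyperplanes give a 32-symbol string.
projection = rng.normal(size=(32, 64))

def make_identifier(descriptor):
    """Map a descriptor to a fixed-length 1D identifier (string of 0/1)."""
    bits = (projection @ descriptor) > 0
    return "".join("1" if b else "0" for b in bits)

# Index built once, offline: identifier -> position in the reference set.
id_to_index = {make_identifier(d): i for i, d in enumerate(reference)}

def direct_lookup(query):
    """Retrieval cost depends on the identifier length, not on N."""
    return id_to_index.get(make_identifier(query))

query = reference[42]
print(linear_scan(query, reference), direct_lookup(query))
```

Unlike a learned model, this stand-in only retrieves exact copies reliably: a nearby but different descriptor may produce a different identifier, in which case `direct_lookup` returns `None`. It is meant only to show why generating an identifier removes the dependence on the size of the reference set.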

Taxonomy

Topics: Robotics and Sensor-Based Localization · 3D Shape Modeling and Analysis · Advanced Image and Video Retrieval Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Focus