D2S: Representing sparse descriptors and 3D coordinates for camera relocalization

Bach-Thuan Bui, Huy-Hoang Bui, Dinh-Tuan Tran, and Joo-Ho Lee

arXiv:2307.15250 · cs.CV · October 24, 2024 · 1 citation

TL;DR

D2S introduces a simple, cost-effective learning-based method for camera relocalization that uses a lightweight network to represent descriptors and scene coordinates, outperforming previous regression methods in various environments.

Contribution

The paper presents D2S, a novel approach that leverages a simple network with graph attention for efficient, scene-specific localization from a single RGB image, with improved generalization capabilities.

Findings

01. Outperforms previous regression-based methods in indoor and outdoor environments.
02. Effectively generalizes across day-night transitions and domain shifts.
03. Uses a lightweight model with selective attention for robust descriptor representation.

Abstract

State-of-the-art visual localization methods mostly rely on complex procedures to match local descriptors and 3D point clouds. However, these procedures can incur significant costs in terms of inference, storage, and updates over time. In this study, we propose a direct learning-based approach that utilizes a simple network named D2S to represent complex local descriptors and their scene coordinates. Our method is characterized by its simplicity and cost-effectiveness. It solely leverages a single RGB image for localization during the testing phase and only requires a lightweight model to encode a complex sparse scene. The proposed D2S employs a combination of a simple loss function and graph attention to selectively focus on robust descriptors while disregarding areas such as clouds, trees, and several dynamic objects. This selective attention enables D2S to effectively perform a…
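The core idea in the abstract, a small network that takes sparse local descriptors and returns their 3D scene coordinates while attending to the more robust ones, can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the single dense self-attention pass, the linear head, the `reliability` score and its threshold, and the random untrained weights are hypothetical and are not taken from the paper or from the ais-lab/d2s repository.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(descs):
    """One attention pass over a fully connected graph of descriptors.

    Each descriptor is updated with a weighted mix of all descriptors
    (weights from scaled dot products), added back as a residual.
    """
    d = len(descs[0])
    out = []
    for q in descs:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in descs]
        weights = softmax(scores)
        mixed = [sum(w * k[i] for w, k in zip(weights, descs)) for i in range(d)]
        out.append([a + b for a, b in zip(q, mixed)])
    return out


def regress_scene_points(descs, W_head, threshold=0.5):
    """Map descriptors to (index, xyz, reliability), keeping reliable ones.

    W_head is a hypothetical 4 x d linear head: three outputs for the
    scene coordinate and one logit for the assumed reliability score.
    """
    kept = []
    for i, feat in enumerate(self_attention(descs)):
        x, y, z, logit = matvec(W_head, feat)
        reliability = 1.0 / (1.0 + math.exp(-logit))
        if reliability > threshold:
            kept.append((i, (x, y, z), reliability))
    return kept


# Toy data: 5 random 8-dimensional descriptors and an untrained head.
rng = random.Random(0)
DIM = 8
descriptors = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(5)]
W_head = [[rng.gauss(0, 0.5) for _ in range(DIM)] for _ in range(4)]

all_points = regress_scene_points(descriptors, W_head, threshold=0.0)
kept_points = regress_scene_points(descriptors, W_head, threshold=0.5)
print(len(all_points), "descriptors in,", len(kept_points), "kept as reliable")
```

With trained weights, the retained 2D keypoint to 3D coordinate pairs would be the input to a camera pose solver; that step is outside the scope of this sketch.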

Code & Models

Repositories

ais-lab/d2s
PyTorch · Official

Taxonomy

Topics: Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques · Advanced Vision and Imaging

Methods: Focus