RGB2LIDAR: Towards Solving Large-Scale Cross-Modal Visual Localization

Niluthpol Chowdhury Mithun; Karan Sikka; Han-Pang Chiu; Supun Samarasekera; Rakesh Kumar

arXiv:2009.05695·cs.CV·September 15, 2020

TL;DR

This paper introduces a large-scale dataset and a joint embedding method for cross-modal visual localization, matching ground RGB images to depth images rendered from aerial LIDAR data at a scale that prior work did not address.

Contribution

It presents the first large-scale dataset and a new joint embedding approach for cross-modal localization between ground RGB images and aerial LIDAR depth images.

Findings

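1. Achieved a median rank of 5 when matching across a test set of 50K location pairs collected from a 14 km^2 area.

2. Created a dataset of over 550K pairs of RGB and aerial LIDAR depth images covering 143 km^2.

3. Demonstrated improved performance over prior methods, which had been evaluated only on small datasets.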

Abstract

We study an important, yet largely unexplored, problem of large-scale cross-modal visual localization by matching ground RGB images to a geo-referenced aerial LIDAR 3D point cloud (rendered as depth images). Prior works were demonstrated on small datasets and did not lend themselves to scaling up for large-scale applications. To enable large-scale evaluation, we introduce a new dataset containing over 550K pairs (covering a 143 km^2 area) of RGB and aerial LIDAR depth images. We propose a novel joint-embedding-based method that effectively combines the appearance and semantic cues from both modalities to handle drastic cross-modal variations. Experiments on the proposed dataset show that our model achieves a strong result of a median rank of 5 in matching across a large test set of 50K location pairs collected from a 14 km^2 area. This represents a significant advancement over prior works…
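The median-rank figure quoted in the abstract is a standard retrieval metric. The snippet below is a minimal sketch of how such a number could be computed. It is not the authors' evaluation code. It assumes that query i's true match is reference i, and that matching uses cosine similarity between embeddings, which the abstract does not specify. The function name `median_rank` and the array shapes are hypothetical.

```python
import numpy as np

def median_rank(query_emb, ref_emb):
    """Median rank of the true match, assuming query i pairs with reference i.

    query_emb: (N, D) array of, e.g., RGB image embeddings (assumed layout).
    ref_emb:   (N, D) array of, e.g., LIDAR depth image embeddings.
    """
    # L2-normalize rows so the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    r = ref_emb / np.linalg.norm(ref_emb, axis=1, keepdims=True)
    sim = q @ r.T  # (N, N) similarity of every query to every reference

    # Similarity of each query to its own ground-truth reference.
    true_sim = np.diag(sim)[:, None]

    # Rank = 1 + number of references scoring strictly higher than the true match.
    ranks = 1 + (sim > true_sim).sum(axis=1)
    return float(np.median(ranks)), ranks

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    refs = rng.normal(size=(1000, 64))
    # Synthetic queries: noisy copies of the references (illustrative only).
    queries = refs + 0.5 * rng.normal(size=refs.shape)
    med, _ = median_rank(queries, refs)
    print("median rank:", med)
```

Under this definition, a median rank of 5 on 50K pairs would mean that for half of the queries the correct reference falls within the top 5 of 50K candidates.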

Code & Models

Repositories

niluthpol/RGB2LIDAR
Official