CVLNet: Cross-View Semantic Correspondence Learning for Video-based Camera Localization

Yujiao Shi; Xin Yu; Shan Wang; Hongdong Li

arXiv:2208.03660 · cs.CV · August 9, 2022 · 1 citation

Open Access

TL;DR

This paper introduces CVLNet, a novel method for cross-view video-based camera localization that aligns ground and overhead images, estimates camera displacement, and leverages video sequences for improved accuracy, validated on a new KITTI-CVL dataset.

Contribution

The paper proposes CVLNet, a new framework that bridges domain gaps in cross-view localization using geometric projection and displacement estimation, enhancing video-based localization accuracy.

Findings

01. Video-based localization outperforms single-image methods.

02. CVLNet effectively aligns ground and satellite images.

03. Proposed modules outperform alternative approaches.

Abstract

This paper tackles the problem of Cross-view Video-based camera Localization (CVL). The task is to localize a query camera by leveraging information from its past observations, i.e., a continuous sequence of images observed at previous time stamps, and matching them to a large overhead-view satellite image. The critical challenge of this task is to learn a powerful global feature descriptor for the sequential ground-view images while considering its domain alignment with reference satellite images. For this purpose, we introduce CVLNet, which first projects the sequential ground-view images into an overhead view by exploring the ground-and-overhead geometric correspondences and then leverages the photo consistency among the projected images to form a global representation. In this way, the cross-view domain differences are bridged. Since the reference satellite images are usually…
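The ground-to-overhead projection mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a pinhole camera, a flat ground plane, and a camera frame with x pointing right, y down, and z forward. The function names (`overhead_to_pixel`, `project_to_overhead`) and parameters (`cam_height`, `meters_per_cell`) are hypothetical and chosen only for illustration. Each cell of an overhead grid centered on the camera is mapped to a ground-image pixel, which is then sampled with nearest-neighbor lookup.

```python
import numpy as np


def overhead_to_pixel(K, cam_height, grid_size, meters_per_cell):
    """Map each overhead grid cell (on an assumed flat ground plane) to pixel coords.

    Assumed camera frame: x right, y down, z forward. Row 0 of the grid is
    the farthest point ahead of the camera; the camera sits at the grid center.
    """
    idx = np.arange(grid_size) - (grid_size - 1) / 2.0
    x = idx * meters_per_cell            # lateral offset in meters
    z = -idx * meters_per_cell           # forward offset; row 0 has the largest z
    X, Z = np.meshgrid(x, z)             # X[i, j] = x[j], Z[i, j] = z[i]
    Y = np.full_like(X, cam_height)      # ground lies cam_height below the camera
    pts = np.stack([X, Y, Z], axis=-1)   # (G, G, 3) points in the camera frame

    proj = pts @ K.T                     # homogeneous pixel coordinates
    depth = proj[..., 2]
    valid = depth > 1e-6                 # only points in front of the camera
    safe_depth = np.where(valid, depth, 1.0)
    uv = proj[..., :2] / safe_depth[..., None]
    uv = np.where(valid[..., None], uv, -1.0)  # mark invalid cells as off-image
    return uv, valid


def project_to_overhead(image, K, cam_height, grid_size, meters_per_cell):
    """Resample a ground-view image onto the overhead grid (nearest neighbor)."""
    uv, valid = overhead_to_pixel(K, cam_height, grid_size, meters_per_cell)
    h, w = image.shape[:2]
    u = np.rint(uv[..., 0]).astype(int)
    v = np.rint(uv[..., 1]).astype(int)
    inside = valid & (u >= 0) & (u < w) & (v >= 0) & (v < h)

    out = np.zeros((grid_size, grid_size) + image.shape[2:], dtype=image.dtype)
    out[inside] = image[v[inside], u[inside]]
    return out, inside


# Illustrative intrinsics (assumed values, not taken from the paper).
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
image = np.arange(100 * 100, dtype=np.float64).reshape(100, 100)
overhead, mask = project_to_overhead(image, K, cam_height=2.0,
                                     grid_size=21, meters_per_cell=1.0)
```

Under the flat-ground assumption, cells behind the camera or outside the image stay empty, so a single frame fills only part of the overhead grid. That is one intuition for why combining several frames of a video, as the abstract describes, could give a more complete overhead representation.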

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Robotics and Sensor-Based Localization