RelMobNet: End-to-end relative camera pose estimation using a robust two-stage training

Praveen Kumar Rajendran; Sumit Mishra; Luiz Felipe Vecchietti; Dongsoo Har

arXiv:2202.12838 · cs.CV · July 12, 2022

Open Access

TL;DR

This paper introduces RelMobNet, an end-to-end siamese network for relative camera pose estimation that improves accuracy and generalization through a novel two-stage training process, outperforming existing CNN-based methods.

Contribution

The paper proposes a new two-stage training approach for a siamese network that enhances translation accuracy and generalization in relative pose estimation without relying on camera parameters.

Findings

1. Improves translation vector estimation by up to 52.27% on certain scenes.

2. Demonstrates better generalization across different scene styles using GAN-based augmentation.

3. Provides qualitative analysis of epipolar lines aligning with ground truth poses.

Abstract

Relative camera pose estimation, i.e., estimating the translation and rotation vectors from a pair of images taken at different locations, is an important part of systems in augmented reality and robotics. In this paper, we present an end-to-end relative camera pose estimation network using a siamese architecture that is independent of camera parameters. The network is trained on the Cambridge Landmarks data, using four individual scene datasets and a dataset combining the four scenes. To improve generalization, we propose a novel two-stage training that alleviates the need for a hyperparameter to balance the translation and rotation loss scale. The proposed method is compared with one-stage training CNN-based methods such as RPNet and RCPNet, and we demonstrate that the proposed model improves translation vector estimation by 16.11%, 28.88%, and 52.27% on the Kings College, Old Hospital,…
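To make the quantities in the abstract concrete, here is a minimal sketch in plain Python of (a) a relative pose target derived from two absolute camera poses and (b) one common form of a weighted pose loss, in which a hyperparameter balances the translation and rotation terms. It is not taken from the paper. The function names (`relative_pose`, `weighted_pose_loss`), the `(w, x, y, z)` quaternion layout, the convention that the relative translation is `t2 - t1` and the relative rotation is `q2 * conj(q1)`, and the weight `beta` are all assumptions made for illustration; the paper's own conventions and its two-stage procedure may differ.

```python
import math


def quat_conjugate(q):
    """Conjugate of a quaternion given as (w, x, y, z)."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def quat_multiply(a, b):
    """Hamilton product a * b of two (w, x, y, z) quaternions."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def relative_pose(t1, q1, t2, q2):
    """Relative pose from camera 1 to camera 2.

    Assumed convention (hypothetical, for illustration only):
    translation is t2 - t1, rotation is q2 * conj(q1).
    """
    t_rel = tuple(b - a for a, b in zip(t1, t2))
    q_rel = quat_multiply(q2, quat_conjugate(q1))
    return t_rel, q_rel


def weighted_pose_loss(t_pred, q_pred, t_true, q_true, beta):
    """Translation error plus beta times rotation error (Euclidean norms).

    beta is the kind of balancing hyperparameter that the abstract
    says the two-stage training avoids having to tune.
    """
    t_err = math.dist(t_pred, t_true)
    q_err = math.dist(q_pred, q_true)
    return t_err + beta * q_err


if __name__ == "__main__":
    identity = (1.0, 0.0, 0.0, 0.0)
    half = math.sqrt(0.5)
    quarter_turn_z = (half, 0.0, 0.0, half)  # 90 degrees about the z axis

    t_rel, q_rel = relative_pose((0.0, 0.0, 0.0), identity,
                                 (1.0, 2.0, 3.0), quarter_turn_z)
    print("relative translation:", t_rel)
    print("relative rotation:", q_rel)
    print("loss:", weighted_pose_loss((0.0, 0.0, 0.0), identity,
                                      t_rel, q_rel, beta=10.0))
```

In a loss of this form, the translation error is measured in scene units while the rotation error is bounded for unit quaternions, so a suitable `beta` tends to vary from scene to scene. That sensitivity is the tuning burden the abstract refers to.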

Taxonomy

Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques

Methods: Siamese Network