2D-3D Pose Tracking with Multi-View Constraints

Huai Yu; Kuangyi Chen; Wen Yang; Sebastian Scherer; Gui-Song Xia

arXiv:2309.11335·cs.RO·October 28, 2024

Open Access

TL;DR

This paper introduces a multi-view 2D-3D pose tracking framework that leverages adjacent frame relationships and multi-view constraints to improve camera localization accuracy in 3D LiDAR maps.

Contribution

It proposes a novel 2D-3D pose tracking method combining a hybrid flow network and multi-view geometrical constraints with a cross-modal consistency loss.

Findings

1. Outperforms existing frame-by-frame 2D-3D pose tracking methods

2. Achieves state-of-the-art results on KITTI and Argoverse datasets

3. Demonstrates improved stability and accuracy in camera localization

Abstract

Camera localization in 3D LiDAR maps has gained increasing attention due to its promising ability to handle complex scenarios, surpassing the limitations of visual-only localization methods. However, existing methods mostly focus on bridging the cross-modal gap and estimate camera poses frame by frame without considering the relationship between adjacent frames, which makes pose tracking unstable. To alleviate this, we propose to couple the 2D-3D correspondences between adjacent frames using 2D-2D feature matching, establishing multi-view geometrical constraints for simultaneously estimating multiple camera poses. Specifically, we propose a new 2D-3D pose tracking framework, which consists of a front-end hybrid flow estimation network for consecutive frames and a back-end pose optimization module. We further design a cross-modal consistency-based loss to incorporate the…
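To make the multi-view idea concrete, the sketch below stacks reprojection errors of the same 3D map points over two adjacent frames, so that both camera poses are constrained by one residual vector. This is only an illustration under stated assumptions, not the paper's back-end: it assumes a pinhole camera, assumes that 2D-2D matching has already linked each 3D point to a pixel in both frames, and uses hypothetical names (`project`, `multi_view_residuals`) and toy data that do not come from the source.

```python
import numpy as np


def project(K, R, t, points_3d):
    """Project Nx3 map points into pixel coordinates with a pinhole model."""
    cam = points_3d @ R.T + t        # map frame -> camera frame
    uv = cam @ K.T                   # apply the intrinsic matrix
    return uv[:, :2] / uv[:, 2:3]    # perspective division


def multi_view_residuals(K, poses, points_3d, pixels_per_frame):
    """Stack reprojection errors of shared 3D points over all frames.

    poses: list of (R, t) tuples, one per frame (hypothetical layout).
    pixels_per_frame: list of Nx2 arrays; row i in every array observes
    points_3d[i], which is the coupling assumed to come from 2D-2D matching.
    """
    residuals = [
        (project(K, R, t, points_3d) - pixels).ravel()
        for (R, t), pixels in zip(poses, pixels_per_frame)
    ]
    return np.concatenate(residuals)


# Toy setup (all values are illustrative assumptions).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
points_3d = np.array([[0.0, 0.0, 5.0],
                      [1.0, -0.5, 6.0],
                      [-1.0, 0.5, 4.0],
                      [0.5, 1.0, 7.0]])
pose_a = (np.eye(3), np.zeros(3))
pose_b = (np.eye(3), np.array([0.2, 0.0, 0.0]))

# Synthetic observations generated from the true poses.
pixels_a = project(K, pose_a[0], pose_a[1], points_3d)
pixels_b = project(K, pose_b[0], pose_b[1], points_3d)

# Residual at the true poses, and with the second pose perturbed.
r_true = multi_view_residuals(K, [pose_a, pose_b], points_3d, [pixels_a, pixels_b])
pose_b_bad = (np.eye(3), np.array([0.3, 0.0, 0.0]))
r_bad = multi_view_residuals(K, [pose_a, pose_b_bad], points_3d, [pixels_a, pixels_b])
```

A nonlinear least-squares solver could minimize this stacked residual over both poses at once, which is the sense in which adjacent frames constrain each other instead of being solved independently.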

Taxonomy

Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Human Pose and Action Recognition