SCIPaD: Incorporating Spatial Clues into Unsupervised Pose-Depth Joint Learning

Yi Feng; Zizhan Guo; Qijun Chen; Rui Fan

arXiv:2407.05283 · cs.CV · July 9, 2024

TL;DR

SCIPaD introduces a novel method that leverages spatial clues and a confidence-aware feature flow estimator to enhance unsupervised joint learning of depth and pose, significantly improving camera pose accuracy in complex scenarios.

Contribution

SCIPaD incorporates spatial clues through two components, a confidence-aware feature flow estimator and a positional clue aggregator, to make pose-depth joint learning more robust.

Findings

01. Achieves a 22.2% reduction in camera pose translation error on the KITTI dataset.

02. Achieves a 34.8% reduction in camera pose angular error on the KITTI dataset.

03. Outperforms state-of-the-art methods in unsupervised depth-pose estimation.

Abstract

Unsupervised monocular depth estimation frameworks have shown promising performance in autonomous driving. However, existing solutions primarily rely on a simple convolutional neural network for ego-motion recovery, which struggles to estimate precise camera poses in dynamic, complicated real-world scenarios. These inaccurately estimated camera poses can inevitably deteriorate the photometric reconstruction and mislead the depth estimation networks with wrong supervisory signals. In this article, we introduce SCIPaD, a novel approach that incorporates spatial clues for unsupervised depth-pose joint learning. Specifically, a confidence-aware feature flow estimator is proposed to acquire 2D feature positional translations and their associated confidence levels. Meanwhile, we introduce a positional clue aggregator, which integrates pseudo 3D point clouds from DepthNet and 2D feature flows…
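The abstract pairs pseudo 3D point clouds (lifted from predicted depth) with 2D feature flows. As a rough illustration of why these two clues together constrain camera motion, the sketch below uses only the standard pinhole camera model. It is not the SCIPaD aggregator or flow estimator: the function names, intrinsics, and translation-only motion are assumptions chosen for illustration, and rotation is omitted for brevity.

```python
from typing import Tuple

Vec3 = Tuple[float, float, float]
Vec2 = Tuple[float, float]


def backproject(u: float, v: float, depth: float,
                fx: float, fy: float, cx: float, cy: float) -> Vec3:
    """Lift pixel (u, v) with a predicted depth to a 3D point in the camera frame."""
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)


def project(p: Vec3, fx: float, fy: float, cx: float, cy: float) -> Vec2:
    """Pinhole projection of a 3D point (z > 0) back to pixel coordinates."""
    x, y, z = p
    return (fx * x / z + cx, fy * y / z + cy)


def induced_flow(u: float, v: float, depth: float, t: Vec3,
                 fx: float, fy: float, cx: float, cy: float) -> Vec2:
    """2D displacement of a pixel when its 3D point is shifted by translation t.

    Hypothetical helper: only translation is modelled here.
    """
    x, y, z = backproject(u, v, depth, fx, fy, cx, cy)
    u2, v2 = project((x + t[0], y + t[1], z + t[2]), fx, fy, cx, cy)
    return (u2 - u, v2 - v)


if __name__ == "__main__":
    # Assumed intrinsics and a small sideways shift, for illustration only.
    fx = fy = 100.0
    cx = cy = 0.0
    shift = (0.1, 0.0, 0.0)
    near = induced_flow(10.0, 5.0, 2.0, shift, fx, fy, cx, cy)
    far = induced_flow(10.0, 5.0, 4.0, shift, fx, fy, cx, cy)
    print("flow at depth 2:", near)
    print("flow at depth 4:", far)
```

Under this model the same translation moves a nearby point further in the image than a distant one, so an observed 2D flow only pins down the motion once depth is known. The same relationship explains why a wrong pose reprojects pixels to the wrong locations and corrupts the photometric supervision that the abstract mentions.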

Code & Models

Repositories

fengyi233/SCIPaD (PyTorch, official)

Taxonomy

Topics: Robot Manipulation and Learning · Human Pose and Action Recognition · Advanced Vision and Imaging