CoL3D: Collaborative Learning of Single-view Depth and Camera Intrinsics for Metric 3D Shape Recovery

Chenghao Zhang; Lubin Fan; Shen Cao; Bojian Wu; Jieping Ye

arXiv:2502.08902 · cs.CV · February 14, 2025

TL;DR

This paper introduces CoL3D, a collaborative learning framework that jointly estimates depth and camera intrinsics from a single image, enabling accurate metric 3D shape recovery crucial for robotics.

Contribution

It proposes a unified network with collaborative optimization for depth, intrinsics, and 3D shapes, including a novel canonical incidence field mechanism and shape similarity loss.

Findings

01

Achieves state-of-the-art depth and camera calibration accuracy

02

Produces high-quality 3D shapes for robotic perception

03

Performs well on diverse indoor and outdoor datasets

Abstract

Recovering the metric 3D shape from a single image is particularly relevant for robotics and embodied intelligence applications, where accurate spatial understanding is crucial for navigation and interaction with environments. Mainstream approaches typically achieve this through monocular depth estimation. However, without camera intrinsics, the metric 3D shape cannot be recovered from depth alone. In this study, we theoretically demonstrate that depth serves as a 3D prior constraint for estimating camera intrinsics and uncover the reciprocal relations between these two elements. Motivated by this, we propose a collaborative learning framework for jointly estimating depth and camera intrinsics, named CoL3D, to learn metric 3D shapes from single images. Specifically, CoL3D adopts a unified network and performs collaborative optimization at three levels: depth, camera intrinsics, and 3D…
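
The abstract's point that depth alone does not determine metric 3D shape can be illustrated with a minimal sketch of standard pinhole back-projection. This is not the CoL3D implementation: the function name `backproject_depth`, the parameters `fx`, `fy`, `cx`, `cy`, and the toy depth map are assumptions made for illustration, and the camera is assumed to be skew-free with no lens distortion.

```python
import numpy as np

def backproject_depth(depth, fx, fy, cx, cy):
    """Lift an (H, W) metric z-depth map to an (H, W, 3) point map.

    Hypothetical helper for illustration; assumes a skew-free pinhole
    camera without lens distortion.
    """
    h, w = depth.shape
    # Pixel coordinate grids: u runs along columns (x), v along rows (y).
    u, v = np.meshgrid(np.arange(w, dtype=np.float64),
                       np.arange(h, dtype=np.float64))
    # Per-pixel ray direction K^{-1} [u, v, 1]^T, written out explicitly.
    x = (u - cx) / fx
    y = (v - cy) / fy
    rays = np.stack([x, y, np.ones_like(x)], axis=-1)
    # Scale each ray by the z-depth of its pixel.
    return rays * depth[..., None]

# Toy example: the same flat depth map under two different focal lengths.
depth = np.full((4, 6), 2.0)
points_a = backproject_depth(depth, fx=500.0, fy=500.0, cx=3.0, cy=2.0)
points_b = backproject_depth(depth, fx=250.0, fy=250.0, cx=3.0, cy=2.0)
```

Both point maps share the same z values, yet halving the focal length doubles every lateral offset from the principal point, so the recovered shape changes even though the depth map is identical. This is why the intrinsics must be known or estimated alongside depth.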

Taxonomy

Topics: 3D Shape Modeling and Analysis · Advanced Vision and Imaging · Industrial Vision Systems and Defect Detection