Metric3Dv2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation

Mu Hu; Wei Yin; Chi Zhang; Zhipeng Cai; Xiaoxiao Long; Kaixuan Wang; Hao Chen; Gang Yu; Chunhua Shen; Shaojie Shen

arXiv:2404.15506·cs.CV·January 6, 2025

TL;DR

Metric3D v2 introduces a versatile monocular geometric foundation model that achieves zero-shot metric depth and surface normal estimation from a single image, enabling accurate 3D recovery without task-specific training.

Contribution

The paper proposes a canonical camera space transformation module and a joint depth-normal optimization module, which allow stable training on large-scale, diverse data and support zero-shot generalization.
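The page names the canonical camera space transformation without detailing it. As a rough illustration only, the sketch below assumes one plausible form: depth labels are multiplied by the ratio of a canonical focal length to the real focal length for training, and predictions are divided by the same ratio at inference. The function names and the canonical focal length of 1000 pixels are hypothetical, not taken from the paper or its repository.

```python
# Hypothetical canonical focal length in pixels (an assumption for illustration).
CANONICAL_FOCAL = 1000.0


def _scale(focal_px, canonical_focal):
    """Ratio that maps a real camera's depth into the canonical camera space."""
    if focal_px <= 0 or canonical_focal <= 0:
        raise ValueError("focal lengths must be positive")
    return canonical_focal / focal_px


def to_canonical(depths_m, focal_px, canonical_focal=CANONICAL_FOCAL):
    """Scale metric depths (metres) into the assumed canonical camera space."""
    s = _scale(focal_px, canonical_focal)
    return [d * s for d in depths_m]


def from_canonical(canonical_depths, focal_px, canonical_focal=CANONICAL_FOCAL):
    """Undo the scaling to recover metric depths for the real camera."""
    s = _scale(focal_px, canonical_focal)
    return [d / s for d in canonical_depths]


if __name__ == "__main__":
    # A camera with half the canonical focal length: depths are doubled.
    canon = to_canonical([2.0, 5.0], focal_px=500.0)
    print(canon)
    print(from_canonical(canon, focal_px=500.0))
```

Under this assumption, images taken with different focal lengths share one depth scale during training, and the known focal length of each test camera restores metric units afterwards.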

Findings

01. Achieves zero-shot metric depth and normal estimation on in-the-wild images.

02. Trained on over 16 million images from diverse camera models.

03. Enables plausible single-image 3D metrology.
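Single-image metrology, as in finding 03, amounts to lifting pixels to 3D with a metric depth map and measuring between the lifted points. A minimal sketch, assuming a pinhole camera with known intrinsics; the function names and the intrinsic values are illustrative and do not come from the paper.

```python
import math


def backproject(u, v, depth_m, fx, fy, cx, cy):
    """Lift pixel (u, v) with metric depth (metres) to a 3D camera-frame point.

    Assumes an ideal pinhole camera: fx, fy are focal lengths in pixels and
    (cx, cy) is the principal point.
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def distance_m(p, q):
    """Euclidean distance in metres between two 3D points."""
    return math.dist(p, q)


if __name__ == "__main__":
    # Hypothetical intrinsics for a 640x480 image.
    fx = fy = 500.0
    cx, cy = 320.0, 240.0
    a = backproject(320, 240, 2.0, fx, fy, cx, cy)
    b = backproject(420, 240, 2.0, fx, fy, cx, cy)
    print(distance_m(a, b))
```

The measurement is only as good as the depth scale: with a relative depth map in place of a metric one, the same code would return a length in arbitrary units.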

Abstract

We introduce Metric3D v2, a geometric foundation model for zero-shot metric depth and surface normal estimation from a single image, which is crucial for metric 3D recovery. While depth and normal are geometrically related and highly complementary, they present distinct challenges. SoTA monocular depth methods achieve zero-shot generalization by learning affine-invariant depths, which cannot recover real-world metrics. Meanwhile, SoTA normal estimation methods have limited zero-shot performance due to the lack of large-scale labeled data. To tackle these issues, we propose solutions for both metric depth estimation and surface normal estimation. For metric depth estimation, we show that the key to a zero-shot single-view model lies in resolving the metric ambiguity from various camera models and large-scale data training. We propose a canonical camera space transformation module, which…
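To illustrate why affine-invariant depth cannot recover real-world metrics on its own: such a prediction is defined only up to an unknown scale and shift, which are typically fitted against ground truth by least squares at evaluation time. The sketch below shows that closed-form fit; the function name and sample values are hypothetical.

```python
def fit_scale_shift(pred, target):
    """Least-squares scale s and shift t such that s * pred + t approximates target."""
    n = len(pred)
    if n != len(target) or n < 2:
        raise ValueError("need two or more paired samples")
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    var_p = sum((p - mean_p) ** 2 for p in pred)
    if var_p == 0:
        raise ValueError("prediction is constant; scale is undetermined")
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    s = cov / var_p
    t = mean_t - s * mean_p
    return s, t


if __name__ == "__main__":
    relative = [0.0, 0.5, 1.0]   # affine-invariant prediction, arbitrary units
    metric = [2.0, 3.0, 4.0]     # ground-truth depth in metres
    s, t = fit_scale_shift(relative, metric)
    print(s, t)
    print([s * r + t for r in relative])
```

Without the ground-truth values there is nothing to fit s and t against, which is the gap a metric depth model aims to close.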

Code & Models

Repositories

yvanyin/metric3d (PyTorch, official)
