MonoMVSNet: Monocular Priors Guided Multi-View Stereo Network

Jianfei Jiang; Qiankun Liu; Haochen Yu; Hongyuan Liu; Liyong Wang; Jiansheng Chen; Huimin Ma

arXiv:2507.11333 · cs.CV · July 16, 2025

TL;DR

MonoMVSNet integrates monocular features and monocular depth from a foundation model into a multi-view stereo network. This improves depth estimation in regions where feature matching fails, such as textureless and reflective surfaces, and yields state-of-the-art results on the DTU and Tanks-and-Temples datasets.

Contribution

The paper introduces MonoMVSNet, a novel MVS network that incorporates monocular priors through attention and dynamic depth updates, improving robustness and accuracy in difficult regions.

Findings

01 Achieves state-of-the-art performance on the DTU and Tanks-and-Temples datasets.

02 Handles textureless and reflective regions, where feature matching typically fails.

03 First to integrate monocular priors into multi-view stereo through attention and dynamic depth updates.

Abstract

Learning-based Multi-View Stereo (MVS) methods aim to predict depth maps for a sequence of calibrated images to recover dense point clouds. However, existing MVS methods often struggle with challenging regions, such as textureless regions and reflective surfaces, where feature matching fails. In contrast, monocular depth estimation inherently does not require feature matching, allowing it to achieve robust relative depth estimation in these regions. To bridge this gap, we propose MonoMVSNet, a novel monocular feature and depth guided MVS network that integrates powerful priors from a monocular foundation model into multi-view geometry. Firstly, the monocular feature of the reference view is integrated into source view features by the attention mechanism with a newly designed cross-view position encoding. Then, the monocular depth of the reference view is aligned to dynamically update…
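The abstract notes that monocular depth is only relative, so it has to be aligned to the scene's scale before it can guide multi-view depth hypotheses. The paper's own alignment step is cut off in the text above, so the sketch below is not the authors' method. It shows a common generic approach instead: a closed-form least-squares fit of a scale and a shift that map relative depths onto reference depths at valid pixels. The function name `align_scale_shift`, its parameters, and the assumption of an affine relation between the two depth maps are all hypothetical and chosen for illustration.

```python
def align_scale_shift(mono, ref, mask=None):
    """Fit scale s and shift t minimising sum((s * mono_i + t - ref_i) ** 2).

    Hypothetical helper for illustration only, not the paper's procedure.
    mono: relative (monocular) depths as a flat list of floats
    ref:  reference depths at the same pixels (for example coarse MVS depths)
    mask: optional list of booleans; only True entries are used in the fit
    """
    pairs = [
        (m, r)
        for i, (m, r) in enumerate(zip(mono, ref))
        if mask is None or mask[i]
    ]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two valid pixels to fit scale and shift")

    mean_m = sum(m for m, _ in pairs) / n
    mean_r = sum(r for _, r in pairs) / n
    var_m = sum((m - mean_m) ** 2 for m, _ in pairs)
    if var_m == 0:
        raise ValueError("monocular depths are constant; scale is undetermined")

    # Ordinary least squares for a line: slope = covariance / variance.
    cov = sum((m - mean_m) * (r - mean_r) for m, r in pairs)
    s = cov / var_m
    t = mean_r - s * mean_m
    return s, t


def apply_alignment(mono, s, t):
    """Map every relative depth into the reference scale."""
    return [s * m + t for m in mono]


if __name__ == "__main__":
    mono = [0.0, 0.5, 1.0, 0.25, 0.75]
    ref = [3.0, 4.0, 5.0, 3.5, 100.0]      # last pixel is an outlier
    valid = [True, True, True, True, False]  # exclude the outlier from the fit
    s, t = align_scale_shift(mono, ref, valid)
    print(s, t, apply_alignment(mono, s, t))
```

The mask matters in practice because multi-view depths are unreliable in exactly the textureless and reflective regions the abstract mentions; fitting only on confident pixels lets the aligned monocular depth fill in the rest.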

Code & Models

Repositories

jianfeij/monomvsnet (Official)

Taxonomy

Topics: Medical Image Segmentation Techniques · Advanced Vision and Imaging · Cell Image Analysis Techniques