RayMVSNet: Learning Ray-based 1D Implicit Fields for Accurate Multi-View Stereo

Junhua Xi; Yifei Shi; Yijie Wang; Yulan Guo; Kai Xu

arXiv:2204.01320·cs.CV·April 5, 2022·1 cites

TL;DR

RayMVSNet introduces a lightweight, ray-based depth optimization approach for multi-view stereo. It learns a 1D implicit field along each camera ray from transformer features, improving accuracy and efficiency over methods that run 3D CNNs on cost volumes.

Contribution

The paper proposes a novel ray-based depth prediction method using 1D implicit fields and transformers, reducing computational costs and enhancing accuracy in multi-view stereo.

Findings

01. Achieves top performance among learning-based methods on the DTU dataset, with an overall reconstruction score of 0.33 mm.

02. Outperforms previous learning-based methods on Tanks & Temples with an f-score of 59.48%.

03. Reduces memory and computation compared to 3D CNN approaches.

Abstract

Learning-based multi-view stereo (MVS) has by far centered around 3D convolution on cost volumes. Due to the high computation and memory consumption of 3D CNN, the resolution of output depth is often considerably limited. Different from most existing works dedicated to adaptive refinement of cost volumes, we opt to directly optimize the depth value along each camera ray, mimicking the range (depth) finding of a laser scanner. This reduces the MVS problem to ray-based depth optimization which is much more light-weight than full cost volume optimization. In particular, we propose RayMVSNet which learns sequential prediction of a 1D implicit field along each camera ray with the zero-crossing point indicating scene depth. This sequential modeling, conducted based on transformer features, essentially learns the epipolar line search in traditional multi-view stereo. We also devise a…
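To make the zero-crossing idea concrete, here is a minimal sketch of how a depth could be read off a 1D field sampled along one camera ray. It is not the paper's implementation: in RayMVSNet the field values are predicted by a learned sequential model, whereas here they are simply given as input. The function name `zero_crossing_depth`, the linear interpolation between samples, and the sample values are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def zero_crossing_depth(
    depths: Sequence[float], field: Sequence[float]
) -> Optional[float]:
    """Return the depth of the first zero-crossing of a sampled 1D field.

    `depths` are sample positions along one camera ray (increasing), and
    `field` holds the field value at each sample. Both are hypothetical
    inputs; in the paper the field is predicted by a network.
    """
    if len(depths) != len(field):
        raise ValueError("depths and field must have the same length")
    if not depths:
        return None

    for i in range(len(depths) - 1):
        f0, f1 = field[i], field[i + 1]
        if f0 == 0.0:
            # The sample lies exactly on the surface.
            return depths[i]
        if f0 * f1 < 0.0:
            # Sign change between samples i and i+1: interpolate linearly
            # to estimate where the field passes through zero.
            t = f0 / (f0 - f1)
            return depths[i] + t * (depths[i + 1] - depths[i])

    # Check the final sample, which the loop above does not test for zero.
    if field[-1] == 0.0:
        return depths[-1]
    return None  # No crossing found within the sampled range.


# Illustrative values only: the field changes sign between depth 1.5 and 2.0.
sample_depths = [1.0, 1.5, 2.0, 2.5]
sample_field = [0.6, 0.2, -0.2, -0.5]
print(zero_crossing_depth(sample_depths, sample_field))
```

Because each ray is handled independently as a short 1D sequence, a readout like this touches only the samples along that ray, which is the intuition behind the abstract's claim that ray-based optimization is lighter than optimizing a full 3D cost volume.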

Taxonomy

Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Image Enhancement Techniques

Methods: 3 Dimensional Convolutional Neural Network · 3D Convolution · Convolution