RayMVSNet++: Learning Ray-based 1D Implicit Fields for Accurate Multi-View Stereo

Yifei Shi; Junhua Xi; Dewen Hu; Zhiping Cai; Kai Xu

arXiv:2307.10233·cs.CV·July 21, 2023

TL;DR

RayMVSNet++ introduces a lightweight, ray-based implicit field approach for multi-view stereo, leveraging sequential prediction and transformer features to improve depth accuracy and handle challenging scenarios effectively.

Contribution

The paper proposes a novel ray-based depth optimization method using 1D implicit fields and transformer features, significantly reducing computation and enhancing accuracy over traditional cost volume methods.

Findings

1. Ranks top on DTU and Tanks & Temples datasets
2. Achieves state-of-the-art performance on ScanNet
3. Effective in textured, occluded, and large depth variation scenes

Abstract

Learning-based multi-view stereo (MVS) has so far centered on 3D convolution over cost volumes. Because 3D CNNs are expensive in both computation and memory, the resolution of the output depth is often considerably limited. Unlike most existing works, which are dedicated to adaptive refinement of cost volumes, we opt to directly optimize the depth value along each camera ray, mimicking the range finding of a laser scanner. This reduces the MVS problem to ray-based depth optimization, which is much more lightweight than full cost volume optimization. In particular, we propose RayMVSNet, which learns sequential prediction of a 1D implicit field along each camera ray, with the zero-crossing point indicating scene depth. This sequential modeling, conducted on transformer features, essentially learns the epipolar line search of traditional multi-view stereo. We devise a multi-task learning…
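
To make the zero-crossing idea concrete, here is a minimal sketch of how a depth could be read off a 1D field once its values have been sampled along one camera ray. It is not the paper's network: in RayMVSNet the field values are predicted sequentially from transformer features, whereas here they are simply given as a list. The function name `zero_crossing_depth`, the sign convention (positive in front of the surface, negative behind it), and the sample values are all assumptions made for illustration.

```python
from typing import Optional, Sequence


def zero_crossing_depth(
    depths: Sequence[float], field: Sequence[float]
) -> Optional[float]:
    """Locate the first zero crossing of a sampled 1D field along a ray.

    depths: increasing sample depths along the ray (hypothetical input).
    field:  signed field value at each sample; assumed positive in front
            of the surface and negative behind it.
    Returns the linearly interpolated depth, or None if no crossing exists.
    """
    if len(depths) != len(field):
        raise ValueError("depths and field must have the same length")

    for i in range(len(depths) - 1):
        f0, f1 = field[i], field[i + 1]
        if f0 == 0.0:
            # A sample lies exactly on the surface.
            return depths[i]
        if f0 > 0.0 and f1 < 0.0:
            # Sign change: interpolate linearly between the two samples.
            t = f0 / (f0 - f1)
            return depths[i] + t * (depths[i + 1] - depths[i])

    if len(field) > 0 and field[-1] == 0.0:
        return depths[-1]
    return None


# Four samples along one ray; the field changes sign between 1.5 and 2.0.
ray_depths = [1.0, 1.5, 2.0, 2.5]
ray_field = [0.8, 0.3, -0.2, -0.7]
estimated = zero_crossing_depth(ray_depths, ray_field)
```

With these sample values the interpolated crossing falls at a depth of about 1.8. Only a handful of samples per ray are needed for this step, which gives some intuition for why per-ray optimization can be lighter than processing a full 3D cost volume.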

Taxonomy

Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Image Processing Techniques and Applications

Methods: 3 Dimensional Convolutional Neural Network · Convolution · 3D Convolution · OPT