RayMVSNet++: Learning Ray-based 1D Implicit Fields for Accurate Multi-View Stereo
Yifei Shi, Junhua Xi, Dewen Hu, Zhiping Cai, Kai Xu

TL;DR
RayMVSNet++ introduces a lightweight, ray-based implicit field approach for multi-view stereo, leveraging sequential prediction and transformer features to improve depth accuracy and handle challenging scenarios effectively.
Contribution
The paper proposes a novel ray-based depth optimization method using 1D implicit fields and transformer features, significantly reducing computation and enhancing accuracy over traditional cost volume methods.
Findings
Ranks top on DTU and Tanks & Temples datasets
Achieves state-of-the-art performance on ScanNet
Effective in textured, occluded, and large depth variation scenes
Abstract
Learning-based multi-view stereo (MVS) has so far centered around 3D convolution on cost volumes. Due to the high computation and memory consumption of 3D CNNs, the resolution of the output depth is often considerably limited. Unlike most existing works, which are dedicated to adaptive refinement of cost volumes, we opt to directly optimize the depth value along each camera ray, mimicking the range finding of a laser scanner. This reduces the MVS problem to ray-based depth optimization, which is much more lightweight than full cost volume optimization. In particular, we propose RayMVSNet, which learns sequential prediction of a 1D implicit field along each camera ray, with the zero-crossing point indicating scene depth. This sequential modeling, conducted on transformer features, essentially learns the epipolar line search of traditional multi-view stereo. We devise a multi-task learning…
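To make the zero-crossing idea concrete, the following is a minimal sketch of how a depth value could be read off a 1D field that has already been sampled along one camera ray. It is not the authors' implementation: the function name `zero_crossing_depth`, the linear interpolation between samples, and the sign convention (positive in front of the surface, negative behind it) are assumptions made for illustration. The learned sequential prediction and the transformer features are not modeled here.

```python
import numpy as np


def zero_crossing_depth(depths, field):
    """Return the depth of the first zero crossing of a sampled 1D field.

    depths: increasing sample depths along a single camera ray.
    field:  field values at those depths (assumed convention: positive in
            front of the surface, negative behind it).
    Returns None when the field never crosses zero on the sampled range.
    """
    depths = np.asarray(depths, dtype=float)
    field = np.asarray(field, dtype=float)
    if depths.shape != field.shape or depths.ndim != 1:
        raise ValueError("depths and field must be 1D arrays of equal length")

    for i in range(len(depths) - 1):
        f0, f1 = field[i], field[i + 1]
        if f0 == 0.0:
            # A sample lies exactly on the surface.
            return float(depths[i])
        if f0 * f1 < 0.0:
            # Sign change: interpolate linearly between the two samples.
            t = f0 / (f0 - f1)
            return float(depths[i] + t * (depths[i + 1] - depths[i]))

    if len(field) > 0 and field[-1] == 0.0:
        return float(depths[-1])
    return None


if __name__ == "__main__":
    sample_depths = [1.0, 2.0, 3.0, 4.0]
    sample_field = [0.5, 0.25, -0.25, -0.5]
    print(zero_crossing_depth(sample_depths, sample_field))
```

Because each ray is handled independently over a 1D list of samples, the per-pixel work in this sketch scales with the number of samples on the ray rather than with a full 3D volume, which is the intuition behind the lightweight claim in the abstract.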
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Image Processing Techniques and Applications
Methods: 3 Dimensional Convolutional Neural Network · Convolution · 3D Convolution · OPT
