RayMVSNet: Learning Ray-based 1D Implicit Fields for Accurate Multi-View Stereo
Junhua Xi, Yifei Shi, Yijie Wang, Yulan Guo, Kai Xu

TL;DR
RayMVSNet introduces a lightweight, ray-based depth optimization approach for multi-view stereo, utilizing 1D implicit fields and transformer features to improve accuracy and efficiency over traditional 3D CNN methods.
Contribution
The paper proposes a novel ray-based depth prediction method using 1D implicit fields and transformers, reducing computational costs and enhancing accuracy in multi-view stereo.
Findings
Achieves top performance on the DTU dataset with an overall reconstruction score of 0.33mm.
Outperforms previous learning-based methods on Tanks & Temples with an f-score of 59.48%.
Reduces memory and computation compared to 3D CNN approaches.
Abstract
Learning-based multi-view stereo (MVS) has so far centered around 3D convolution on cost volumes. Due to the high computation and memory consumption of 3D CNNs, the resolution of the output depth is often considerably limited. Different from most existing works dedicated to adaptive refinement of cost volumes, we opt to directly optimize the depth value along each camera ray, mimicking the range (depth) finding of a laser scanner. This reduces the MVS problem to ray-based depth optimization, which is much more lightweight than full cost volume optimization. In particular, we propose RayMVSNet, which learns sequential prediction of a 1D implicit field along each camera ray, with the zero-crossing point indicating scene depth. This sequential modeling, conducted based on transformer features, essentially learns the epipolar line search in traditional multi-view stereo. We also devise a…
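The zero-crossing idea in the abstract can be illustrated with a minimal sketch. It assumes the 1D implicit field has already been predicted at a few sampled depths along one ray, and that its values behave like a signed distance that changes sign at the surface. The function name `zero_crossing_depth`, the sampling, and the linear interpolation are illustrative assumptions, not the paper's implementation.

```python
from typing import Optional, Sequence


def zero_crossing_depth(
    depths: Sequence[float], values: Sequence[float]
) -> Optional[float]:
    """Return the depth of the first zero crossing of a sampled 1D field.

    `depths` are sample positions along a single camera ray and `values`
    are the (hypothetical) implicit-field predictions at those positions.
    Returns None if the field never changes sign on the sampled interval.
    """
    if len(depths) != len(values):
        raise ValueError("depths and values must have the same length")

    for i in range(len(values) - 1):
        v0, v1 = values[i], values[i + 1]
        if v0 == 0.0:
            # A sample lies exactly on the surface.
            return depths[i]
        if v0 * v1 < 0.0:
            # Sign change between samples i and i+1: interpolate linearly
            # to estimate where the field equals zero.
            t = v0 / (v0 - v1)
            return depths[i] + t * (depths[i + 1] - depths[i])

    if len(values) > 0 and values[-1] == 0.0:
        return depths[-1]
    return None


# Example: the field is positive in front of the surface, negative behind it.
sample_depths = [1.0, 2.0, 3.0, 4.0]
field_values = [1.5, 0.5, -0.5, -1.5]
estimated_depth = zero_crossing_depth(sample_depths, field_values)
```

In this toy example the sign change lies between the samples at depths 2.0 and 3.0, so the estimate falls between those two depths. Working per ray like this only needs a 1D array of samples, which is the intuition behind the claimed savings over a full 3D cost volume.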
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Image Enhancement Techniques
Methods: 3 Dimensional Convolutional Neural Network · 3D Convolution · Convolution
