# A Light Multi-View Stereo Method with Patch-Uncertainty Awareness

**Authors:** Zhen Liu, Guangzheng Wu, Tao Xie, Shilong Li, Chao Wu, Zhiming Zhang, Jiali Zhou

PMC · DOI: 10.3390/s24041293 · Sensors (Basel, Switzerland) · 2024-02-17

## TL;DR

This paper introduces a learning-based multi-view stereo method that combines attention-enhanced coarse features with adaptive depth sampling to improve 3D reconstruction accuracy while keeping GPU memory consumption low.

## Contribution

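The approach combines three components: a lightweight, coarse-feature-enhanced feature pyramid network that uses channel and spatial attention; a patch-uncertainty-based depth sampling strategy that sets the sampling range dynamically inside a GRU-based optimization process; and edge features, extracted from the reference image's feature map, that are integrated into the iterative cost volume construction.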

## Key findings

- A lightweight feature pyramid network enhances coarse-stage features for better initial depth estimation.
- A patch-uncertainty-based depth sampling strategy improves reconstruction accuracy during optimization.
- The method achieves competitive reconstruction quality with low GPU memory consumption on the DTU and Tanks and Temples benchmark datasets.
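
The idea behind the second finding can be illustrated with a minimal sketch. This summary does not give the paper's actual formulation, so everything below is an assumption made for illustration: uncertainty is measured as normalized entropy over depth hypotheses, it is averaged over a square patch, and the inverse-depth sampling interval widens linearly with it. The function names and the `r_min`/`r_max` parameters are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy of one pixel's depth-hypothesis distribution, scaled to [0, 1]."""
    n = len(probs)
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(n) if n > 1 else 0.0


def patch_uncertainty(unc, y, x, radius=1):
    """Mean uncertainty over the patch centred on (y, x), clipped at the image border."""
    height, width = len(unc), len(unc[0])
    vals = [
        unc[j][i]
        for j in range(max(0, y - radius), min(height, y + radius + 1))
        for i in range(max(0, x - radius), min(width, x + radius + 1))
    ]
    return sum(vals) / len(vals)


def inverse_depth_samples(depth, u, num_samples=4, r_min=0.002, r_max=0.02):
    """Depth hypotheses spaced uniformly in inverse depth around `depth`.

    Assumption: the half-width of the inverse-depth interval grows linearly
    from r_min to r_max as the patch uncertainty u goes from 0 to 1.
    Requires num_samples >= 2.
    """
    half = r_min + (r_max - r_min) * u
    inv = 1.0 / depth
    lo, hi = max(inv - half, 1e-6), inv + half
    step = (hi - lo) / (num_samples - 1)
    # Increasing inverse depth means decreasing depth, so the list runs far to near.
    return [1.0 / (lo + k * step) for k in range(num_samples)]
```

Under these assumptions, a pixel in a confident patch (`u` near 0) is refined within a narrow band around its current depth, while a pixel in an ambiguous patch (`u` near 1) searches a wider band, in contrast to using one fixed range for all pixels.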

## Abstract

Multi-view stereo methods utilize image sequences from different views to generate a 3D point cloud model of the scene. However, existing approaches often overlook coarse-stage features, impacting the final reconstruction accuracy. Moreover, using a fixed range for all the pixels during inverse depth sampling can adversely affect depth estimation. To address these challenges, we present a novel learning-based multi-view stereo method incorporating attention mechanisms and an adaptive depth sampling strategy. Firstly, we propose a lightweight, coarse-feature-enhanced feature pyramid network in the feature extraction stage, augmented by a coarse-feature-enhanced module. This module integrates features with channel and spatial attention, enriching the contextual features that are crucial for the initial depth estimation. Secondly, we introduce a novel patch-uncertainty-based depth sampling strategy for depth refinement, dynamically configuring depth sampling ranges within the GRU-based optimization process. Furthermore, we incorporate an edge detection operator to extract edge features from the reference image’s feature map. These edge features are additionally integrated into the iterative cost volume construction, enhancing the reconstruction accuracy. Lastly, our method is rigorously evaluated on the DTU and Tanks and Temples benchmark datasets, revealing its low GPU memory consumption and competitive reconstruction quality compared to other learning-based MVS methods.
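
The abstract does not name the edge detection operator. As a hedged illustration only, the sketch below assumes a Sobel operator applied to a single channel of a feature map, with zero padding at the border; the function names are hypothetical and the paper's operator and integration into the cost volume may differ.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def correlate3x3(channel, kernel):
    """Cross-correlate a 2D list with a 3x3 kernel, treating out-of-bounds pixels as zero."""
    height, width = len(channel), len(channel[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < height and 0 <= xx < width:
                        acc += kernel[dy + 1][dx + 1] * channel[yy][xx]
            out[y][x] = acc
    return out


def edge_magnitude(channel):
    """Per-pixel gradient magnitude sqrt(gx^2 + gy^2) of one feature-map channel."""
    gx = correlate3x3(channel, SOBEL_X)
    gy = correlate3x3(channel, SOBEL_Y)
    return [[math.hypot(a, b) for a, b in zip(row_x, row_y)]
            for row_x, row_y in zip(gx, gy)]
```

Because of the zero padding, border pixels of a non-zero map produce spurious responses; a real pipeline would typically use replicate padding or ignore the border.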

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC10892961/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC10892961/full.md

## References

48 references — full list in the complete paper: https://tomesphere.com/paper/PMC10892961/full.md

---
Source: https://tomesphere.com/paper/PMC10892961