Multi-View Stereo Using Perspective-Aware Features and Metadata to Improve Cost Volume
Zongcheng Zuo, Yuanxiang Li, Yu Zhou, Fan Mo

TL;DR
This paper introduces PAC-MVSNet, a multi-view stereo method for dense 3D reconstruction from calibrated images that improves accuracy in challenging areas such as reflective and texture-less surfaces.
Contribution
The approach combines perspective-aware convolution (PAC) with metadata-enhanced cost volumes to improve feature matching in multi-view stereo.
Findings
PAC-MVSNet improves feature matching in texture-less and reflective regions using perspective-aware convolutions.
The method integrates metadata like camera pose distance to guide geometric reasoning during cost aggregation.
The proposed network outperforms existing methods on multiple benchmark datasets.
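To make the metadata finding concrete, here is a minimal NumPy sketch of how a camera pose distance could be attached to matching scores during cost aggregation. It is an illustration under stated assumptions, not the PAC-MVSNet implementation: the function names `pose_distance` and `metadata_cost`, the tensor layouts, and the linear weights `w` (a stand-in for whatever learned reduction the network uses) are all hypothetical, and the source features are assumed to be already warped into the reference view for each depth hypothesis.

```python
import numpy as np

def pose_distance(T_ref, T_src):
    """Translation distance and rotation angle between two 4x4 camera-to-world poses."""
    rel = np.linalg.inv(T_ref) @ T_src
    t_dist = np.linalg.norm(rel[:3, 3])
    # Rotation angle recovered from the trace of the relative rotation matrix.
    cos_angle = np.clip((np.trace(rel[:3, :3]) - 1.0) / 2.0, -1.0, 1.0)
    return float(t_dist), float(np.arccos(cos_angle))

def metadata_cost(ref_feat, warped_src_feats, pose_meta, w):
    """
    ref_feat:         (C, H, W) reference-view features.
    warped_src_feats: (V, D, C, H, W) source features, assumed pre-warped per depth hypothesis.
    pose_meta:        (V, M) per-view metadata, e.g. [translation distance, rotation angle].
    w:                (1 + M,) hypothetical linear weights standing in for a learned reduction.
    Returns a (D, H, W) cost volume averaged over source views.
    """
    V, D, C, H, W = warped_src_feats.shape
    M = pose_meta.shape[1]
    # Feature similarity per view and depth hypothesis.
    dot = np.einsum('chw,vdchw->vdhw', ref_feat, warped_src_feats) / C
    # Broadcast the per-view metadata to every depth hypothesis and pixel.
    meta = np.broadcast_to(pose_meta[:, None, :, None, None], (V, D, M, H, W))
    stacked = np.concatenate([dot[:, :, None], meta], axis=2)  # (V, D, 1 + M, H, W)
    per_view = np.einsum('m,vdmhw->vdhw', w, stacked)
    return per_view.mean(axis=0)
```

In this sketch the metadata simply enters as extra channels next to the feature similarity, so the reduction can weigh a match differently depending on how far the source camera is from the reference camera.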
Abstract
Feature matching is pivotal when using multi-view stereo (MVS) to reconstruct dense 3D models from calibrated images. This paper proposes PAC-MVSNet, which integrates perspective-aware convolution (PAC) and metadata-enhanced cost volumes to address the challenges in reflective and texture-less regions. PAC dynamically aligns convolutional kernels with scene perspective lines, while the use of metadata (e.g., camera pose distance) enables geometric reasoning during cost aggregation. In PAC-MVSNet, we introduce feature matching with long-range tracking that utilizes both internal and external focuses to integrate extensive contextual data within individual images as well as across multiple images. To enhance the performance of the feature matching with long-range tracking, we also propose a perspective-aware convolution module that directs the convolutional kernel to capture features…
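One way to picture a kernel that is aligned with scene perspective lines is to rotate the sampling grid of a 3x3 kernel so that its horizontal axis points toward a vanishing point. The sketch below is only an interpretation of that idea: the excerpt does not specify how the PAC module computes its alignment, so the vanishing-point input, the function names `rotated_offsets`, `bilinear`, and `pac_response`, and the single-channel image are all assumptions made for illustration.

```python
import numpy as np

def rotated_offsets(pixel_xy, vanishing_point_xy):
    """3x3 sampling offsets (dx, dy) whose x-axis points from the pixel to the vanishing point."""
    d = np.asarray(vanishing_point_xy, float) - np.asarray(pixel_xy, float)
    theta = np.arctan2(d[1], d[0])
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    # Row-major grid: dy is the outer loop, matching kernel.ravel() order.
    base = np.array([(dx, dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1)], float)
    return base @ R.T

def bilinear(img, x, y):
    """Bilinearly interpolate a 2D array at a fractional (x, y), clamped to the border."""
    H, W = img.shape
    x = float(np.clip(x, 0, W - 1))
    y = float(np.clip(y, 0, H - 1))
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, W - 1), min(y0 + 1, H - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * img[y0, x0] + fx * img[y0, x1]
    bot = (1 - fx) * img[y1, x0] + fx * img[y1, x1]
    return (1 - fy) * top + fy * bot

def pac_response(img, kernel, pixel_xy, vanishing_point_xy):
    """Apply a 3x3 kernel at one pixel using the rotated sampling grid."""
    offsets = rotated_offsets(pixel_xy, vanishing_point_xy)
    samples = [bilinear(img, pixel_xy[0] + ox, pixel_xy[1] + oy) for ox, oy in offsets]
    return float(np.dot(kernel.ravel(), samples))
```

When the vanishing point lies directly to the right of the pixel, the rotation angle is zero and the response reduces to an ordinary 3x3 cross-correlation, which makes the sketch easy to sanity-check against a standard convolution layer.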
Taxonomy
Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · 3D Surveying and Cultural Heritage
