Revisiting PatchMatch Multi-View Stereo for Urban 3D Reconstruction

Marco Orsingher; Paolo Zani; Paolo Medici; Massimo Bertozzi

arXiv:2207.08439·cs.CV·July 19, 2022

Open Access

TL;DR

This paper presents an enhanced PatchMatch Multi-View Stereo pipeline for urban 3D reconstruction that combines visual SLAM initialization, a novel depth-normal consistency loss term, and a global refinement step, and reports state-of-the-art results on the KITTI dataset.

Contribution

It introduces a comprehensive urban 3D reconstruction pipeline that combines SLAM initialization, a novel depth-normal consistency loss, and a global refinement step.

Findings

01 Achieves state-of-the-art performance on the KITTI dataset

02 Effectively balances local PatchMatch optimization with global consistency

03 Outperforms classical MVS algorithms and monocular depth networks

Abstract

In this paper, a complete pipeline for image-based 3D reconstruction of urban scenarios is proposed, based on PatchMatch Multi-View Stereo (MVS). Input images are first fed into an off-the-shelf visual SLAM system to extract camera poses and sparse keypoints, which are used to initialize PatchMatch optimization. Then, pixelwise depths and normals are iteratively computed in a multi-scale framework with a novel depth-normal consistency loss term and a global refinement algorithm to balance the inherently local nature of PatchMatch. Finally, a large-scale point cloud is generated by back-projecting multi-view consistent estimates into 3D. The proposed approach is carefully evaluated against both classical MVS algorithms and monocular depth networks on the KITTI dataset, showing state-of-the-art performance.
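The final step of the abstract, back-projecting per-pixel depth estimates into a 3D point cloud, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a standard pinhole camera model, a camera-to-world pose, and a depth map in which pixels that failed the multi-view consistency check have already been marked invalid (`None` or non-positive). The function name `back_project` and all parameter names are hypothetical.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]


def back_project(
    depth: List[List[Optional[float]]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    rotation: List[List[float]],
    translation: List[float],
) -> List[Point]:
    """Back-project a depth map into world-frame 3D points.

    Assumptions (not taken from the paper): pinhole intrinsics
    (fx, fy, cx, cy), and a camera-to-world pose given as a 3x3
    rotation matrix and a 3-vector translation. Pixels whose depth
    is None or <= 0 are treated as invalid and skipped.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):          # v: image row (y)
        for u, d in enumerate(row):          # u: image column (x)
            if d is None or d <= 0:
                continue
            # Pixel -> camera frame: X_c = d * K^{-1} [u, v, 1]^T
            cam = ((u - cx) / fx * d, (v - cy) / fy * d, d)
            # Camera frame -> world frame: X_w = R * X_c + t
            world = tuple(
                sum(rotation[i][j] * cam[j] for j in range(3)) + translation[i]
                for i in range(3)
            )
            points.append(world)
    return points
```

Running this over every frame with its SLAM-estimated pose and concatenating the results would yield a single large-scale point cloud in a common world frame, which is the role this step plays in the pipeline described above.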

Taxonomy

Topics: Advanced Vision and Imaging · Remote Sensing and LiDAR Applications · Robotics and Sensor-Based Localization