Unsupervised OmniMVS: Efficient Omnidirectional Depth Inference via Establishing Pseudo-Stereo Supervision

Zisong Chen; Chunyu Lin; Lang Nie; Kang Liao; Yao Zhao

arXiv:2302.09922·cs.CV·February 23, 2023

TL;DR

This paper introduces Unsupervised OmniMVS, a novel framework for omnidirectional multi-view stereo that uses pseudo-stereo supervision from fisheye images, enabling efficient 3D depth inference without dense labels.

Contribution

It presents the first unsupervised omnidirectional MVS method with a new pseudo-stereo supervision approach and an efficient network architecture with frequency attention and a light cost volume.

Findings

01

Performance competitive with supervised methods

02

Better generalization on real-world data

03

Efficient inference with novel components

Abstract

Omnidirectional multi-view stereo (MVS) vision is attractive for its ultra-wide field-of-view (FoV), enabling machines to perceive 360° 3D surroundings. However, the existing solutions require expensive dense depth labels for supervision, making them impractical in real-world applications. In this paper, we propose the first unsupervised omnidirectional MVS framework based on multiple fisheye images. To this end, we project all images to a virtual view center and composite two panoramic images with spherical geometry from two pairs of back-to-back fisheye images. The two 360° images formulate a stereo pair with a special pose, and the photometric consistency is leveraged to establish the unsupervised constraint, which we term "Pseudo-Stereo Supervision". In addition, we propose Un-OmniMVS, an efficient unsupervised omnidirectional MVS network, to facilitate the inference speed…
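
The photometric-consistency idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes both panoramas are equirectangular, that the two view centers differ by a pure translation (the abstract's "special pose" is not specified here), that depth means radial distance along each ray, and that nearest-neighbor sampling and a plain L1 error stand in for whatever differentiable sampling and loss the paper uses. All function names (equirect_rays, rays_to_pixels, photometric_loss) are hypothetical.

```python
import numpy as np

def equirect_rays(h, w):
    """Unit ray direction for each pixel center of an h x w equirectangular image."""
    lon = (np.arange(w) + 0.5) / w * 2.0 * np.pi - np.pi   # longitude in (-pi, pi)
    lat = np.pi / 2.0 - (np.arange(h) + 0.5) / h * np.pi   # latitude in (-pi/2, pi/2)
    lon, lat = np.meshgrid(lon, lat)
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)
    return np.stack([x, y, z], axis=-1)                    # shape (h, w, 3)

def rays_to_pixels(points, h, w):
    """Project 3D points (non-zero norm assumed) to equirectangular (col, row) coordinates."""
    d = points / np.linalg.norm(points, axis=-1, keepdims=True)
    lon = np.arctan2(d[..., 0], d[..., 2])
    lat = np.arcsin(np.clip(d[..., 1], -1.0, 1.0))
    col = (lon + np.pi) / (2.0 * np.pi) * w - 0.5
    row = (np.pi / 2.0 - lat) / np.pi * h - 0.5
    return col, row

def photometric_loss(pano_a, pano_b, depth_a, t_ab):
    """Mean L1 error between pano_a and pano_b warped into view A.

    depth_a: radial distance per pixel of view A, shape (h, w).
    t_ab: position of view B's center in A's frame (assumption: no rotation).
    """
    h, w = depth_a.shape
    pts_a = equirect_rays(h, w) * depth_a[..., None]  # back-project view A pixels
    pts_b = pts_a - t_ab                              # same points seen from B
    col, row = rays_to_pixels(pts_b, h, w)
    c = np.rint(col).astype(int) % w                  # longitude wraps around
    r = np.clip(np.rint(row).astype(int), 0, h - 1)   # latitude is clamped
    warped = pano_b[r, c]                             # nearest-neighbor sampling
    return np.abs(pano_a - warped).mean()

# Toy usage with random data: identical views and zero baseline give zero error.
rng = np.random.default_rng(0)
pano = rng.random((8, 16, 3))
depth = np.full((8, 16), 2.0)
loss_same = photometric_loss(pano, pano, depth, np.zeros(3))
loss_shifted = photometric_loss(pano, pano, depth, np.array([1.0, 0.0, 0.0]))
```

In a training setting, the depth map would come from the network and the error would be minimized with respect to it, which is why a differentiable (for example bilinear) sampler would replace the nearest-neighbor lookup used in this sketch.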

Taxonomy

Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Advanced Image Processing Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings