3D scene reconstruction from monocular spherical video with motion parallax

Kenji Tanaka

arXiv:2206.06533·cs.CV·June 15, 2022

Open Access

TL;DR

This paper presents a method to extract nearly complete 360-degree depth information from monocular spherical videos with motion parallax, enabling detailed 3D scene reconstruction from simple, widely available footage.

Contribution

The paper introduces a monocular spherical stereo technique that retrieves depth over nearly the full sphere from ordinary 360-degree video containing motion parallax, without requiring specialized equipment.

Findings

01

Depth retrieval covers up to 97% of the sphere in solid angle.

02

With the camera moving at 30 km/h, depth can be estimated for objects more than 30 m away.

03

Point clouds reconstructed from the depth data show clearly observable 3D structure.
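
To put the figures above in scale, the short sketch below converts them into a per-frame baseline, an angular parallax, and a solid angle. The 30 fps frame rate and the broadside viewing geometry are assumptions made for illustration only; the page does not state the frame rate or the viewing direction used in the experiments, and the function names are hypothetical.

```python
import math

def baseline_per_frame(speed_kmh, fps):
    """Distance in meters the camera travels between two adjacent frames."""
    return speed_kmh * 1000.0 / 3600.0 / fps

def broadside_parallax_deg(baseline_m, distance_m):
    """Angular shift in degrees of a point that lies at 90 degrees to the
    motion direction in the first frame, after the camera moves baseline_m."""
    return math.degrees(math.atan2(baseline_m, distance_m))

def covered_steradians(fraction):
    """Solid angle in steradians for a given fraction of the full sphere (4*pi sr)."""
    return fraction * 4.0 * math.pi

FPS = 30.0  # assumed frame rate, not taken from the paper

b = baseline_per_frame(30.0, FPS)        # baseline at 30 km/h
shift = broadside_parallax_deg(b, 30.0)  # parallax of an object 30 m away
area = covered_steradians(0.97)          # 97% of the sphere

print(f"baseline: {b:.3f} m, parallax: {shift:.3f} deg, coverage: {area:.2f} sr")
```

Under these assumptions the baseline between adjacent frames is under 0.3 m and the parallax of an object 30 m away is roughly half a degree, which indicates how small the angular shifts are that the method has to resolve.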

Abstract

In this paper, we describe a method to capture nearly entirely spherical (360 degree) depth information using two adjacent frames from a single spherical video with motion parallax. After illustrating a spherical depth information retrieval using two spherical cameras, we demonstrate monocular spherical stereo by using stabilized first-person video footage. Experiments demonstrated that the depth information was retrieved on up to 97% of the entire sphere in solid angle. At a speed of 30 km/h, we were able to estimate the depth of an object located over 30 m from the camera. We also reconstructed the 3D structures (point cloud) using the obtained depth data and confirmed the structures can be clearly observed. We can apply this method to 3D structure retrieval of surrounding environments such as 1) previsualization, location hunting/planning of a film, 2) real scene/computer graphics…
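
The idea of recovering depth from two frames of a moving spherical camera can be illustrated with plain triangulation. The sketch below is not the paper's algorithm. It assumes pure translation between the two frames (consistent with the stabilized footage mentioned in the abstract), a known baseline, and a point whose angle from the motion direction has already been measured in both frames. The function name and parameters are hypothetical.

```python
import math

def depth_from_parallax(theta1, theta2, baseline):
    """Distance from the FIRST camera position to a scene point.

    theta1, theta2: angles in radians between the motion direction and the
    ray to the point, measured in the first and second frame respectively.
    baseline: distance in meters travelled between the two frames.

    The law of sines in the triangle (camera 1, camera 2, point) gives
        r1 = baseline * sin(theta2) / sin(theta2 - theta1).
    """
    parallax = theta2 - theta1
    if abs(math.sin(parallax)) < 1e-12:
        # Points along the motion axis show no parallax, so depth is undefined.
        raise ValueError("no measurable parallax for this point")
    return baseline * math.sin(theta2) / math.sin(parallax)

# Check against a known layout: camera moves 0.5 m along +x,
# and the point sits at (3, 4), i.e. 5 m from the first position.
theta1 = math.atan2(4.0, 3.0)        # angle seen from (0, 0)
theta2 = math.atan2(4.0, 3.0 - 0.5)  # angle seen from (0.5, 0)
r1 = depth_from_parallax(theta1, theta2, 0.5)
print(round(r1, 6))
```

The guard on vanishing parallax reflects a general property of this geometry: directions close to the axis of motion produce almost no angular shift, so depth there cannot be triangulated reliably.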

Taxonomy

Topics: Advanced Vision and Imaging · Image and Video Stabilization · Robotics and Sensor-Based Localization

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings