Deep 3D Pan via adaptive "t-shaped" convolutions with global and local adaptive dilations

Juan Luis Gonzalez Bello; Munchurl Kim

arXiv:1910.01089 · eess.SP · October 22, 2019

Open Access

TL;DR

This paper introduces a deep learning architecture, the monster-net, that uses adaptive "t-shaped" kernels with globally and locally adaptive dilations to synthesize stereoscopic views from a single image. This enables 3D visualization of a 2D scene and improves depth estimation accuracy.

Contribution

The paper presents a new network architecture with 't-shaped' adaptive kernels and dilations for view synthesis and depth estimation, outperforming existing methods on multiple datasets.

Findings

01

Significantly outperforms state-of-the-art methods on the RMSE, PSNR, and SSIM metrics.

02

Produces more reliable image structures with coherent geometry.

03

Extracted disparity information is more accurate for monocular depth estimation.
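
The RMSE and PSNR metrics named in the first finding can be computed roughly as in the sketch below. This is a minimal NumPy illustration, not the paper's evaluation code: the function names and the default peak value of 255 (8-bit images) are assumptions made here. SSIM is left out because it requires a windowed implementation.

```python
import numpy as np


def rmse(reference, estimate):
    """Root-mean-square error between two images of the same shape."""
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    return float(np.sqrt(np.mean((ref - est) ** 2)))


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB; `peak` is the assumed maximum pixel value."""
    err = rmse(reference, estimate)
    if err == 0.0:
        return float("inf")  # identical images
    return float(20.0 * np.log10(peak / err))


# Toy example: a synthesized view that is off by 5 gray levels everywhere.
ground_truth = np.zeros((4, 4))
synthesized = np.full((4, 4), 5.0)
print(rmse(ground_truth, synthesized))
print(psnr(ground_truth, synthesized))
```

Lower RMSE and higher PSNR both indicate a synthesized view that is closer to the ground-truth view.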

Abstract

Recent advances in deep learning have shown promising results in many low-level vision tasks. However, solving the single-image-based view synthesis is still an open problem. In particular, the generation of new images at parallel camera views given a single input image is of great interest, as it enables 3D visualization of the 2D input scenery. We propose a novel network architecture to perform stereoscopic view synthesis at arbitrary camera positions along the X-axis, or Deep 3D Pan, with "t-shaped" adaptive kernels equipped with globally and locally adaptive dilations. Our proposed network architecture, the monster-net, is devised with a novel "t-shaped" adaptive kernel with globally and locally adaptive dilation, which can efficiently incorporate global camera shift into and handle local 3D geometries of the target image's pixels for the synthesis of naturally looking 3D panned…
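
To make the idea of a "t-shaped" sampling footprint with adaptive dilation more concrete, the sketch below generates sampling offsets for one output pixel. It is only an illustration under stated assumptions, not the authors' implementation. The abstract does not specify the footprint's exact layout, so the sideways-T arrangement (a vertical bar through the pixel plus a horizontal arm pointing in the pan direction), the parameter names `arm_len`, `half_height`, `pan`, and `local_dilation`, and the rule that the horizontal step equals the pan amount times a per-pixel factor are all hypothetical.

```python
import numpy as np


def t_shaped_offsets(arm_len, half_height, pan, local_dilation=1.0):
    """Return an (N, 2) array of (dy, dx) sampling offsets for one pixel.

    Assumed layout (hypothetical, not taken from the paper):
      * a vertical bar of 2 * half_height + 1 taps through the pixel;
      * a horizontal arm of arm_len taps on the side given by the sign of pan.
    Assumed dilation rule: horizontal step = |pan| * local_dilation, so the
    global camera shift and a per-pixel factor both stretch the arm.
    """
    dilation = abs(pan) * local_dilation
    direction = 1.0 if pan >= 0 else -1.0
    vertical = [(float(dy), 0.0) for dy in range(-half_height, half_height + 1)]
    horizontal = [(0.0, direction * k * dilation) for k in range(1, arm_len + 1)]
    return np.array(vertical + horizontal)


# Example: pan of 2.0 with a local factor of 0.5 gives a horizontal step of 1.0.
offsets = t_shaped_offsets(arm_len=3, half_height=1, pan=2.0, local_dilation=0.5)
print(offsets)
```

In a full network the local dilation would be predicted per pixel and the fractional offsets would be sampled with interpolation. Both steps are omitted here.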

Taxonomy

Topics: Advanced Vision and Imaging · Optical measurement and interference techniques · Computer Graphics and Visualization Techniques