Deep 3D Pan via adaptive "t-shaped" convolutions with global and local adaptive dilations
Juan Luis Gonzalez Bello, Munchurl Kim

TL;DR
This paper introduces a deep learning architecture, the monster-net, that uses adaptive "t-shaped" kernels with globally and locally adaptive dilations to synthesize stereoscopic views from a single image. The synthesized views enable 3D visualization of 2D input scenery, and the disparity information extracted from the network improves monocular depth estimation.
Contribution
The paper presents a new network architecture built on "t-shaped" adaptive kernels with globally and locally adaptive dilations, applied to view synthesis and depth estimation. It outperforms existing methods on multiple datasets.
Findings
Significantly outperforms prior state-of-the-art methods on RMSE, PSNR, and SSIM.
Produces more reliable image structures with coherent geometry.
Extracted disparity information is more accurate for monocular depth estimation.
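For readers unfamiliar with the metrics named above, RMSE and PSNR can be computed as in the sketch below. This is a minimal illustration and not the paper's evaluation code: the function names, the flattened-list image representation, and the 8-bit peak value of 255 are assumptions made for the example. SSIM is omitted because it needs windowed statistics.

```python
import math

def rmse(pred, target):
    """Root-mean-square error between two equally sized flat pixel sequences."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    return math.sqrt(mse)

def psnr(pred, target, max_value=255.0):
    """Peak signal-to-noise ratio in dB (assumes an 8-bit peak by default)."""
    err = rmse(pred, target)
    if err == 0:
        return math.inf  # identical images
    return 20.0 * math.log10(max_value / err)

# Toy 2x2 grayscale "images", flattened row by row (hypothetical values).
target = [10, 20, 30, 40]
pred = [12, 18, 30, 44]
print(rmse(pred, target))  # lower is better
print(psnr(pred, target))  # higher is better
```

Lower RMSE and higher PSNR both indicate a synthesized view that is closer to the ground-truth view.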
Abstract
Recent advances in deep learning have shown promising results in many low-level vision tasks. However, solving the single-image-based view synthesis is still an open problem. In particular, the generation of new images at parallel camera views given a single input image is of great interest, as it enables 3D visualization of the 2D input scenery. We propose a novel network architecture to perform stereoscopic view synthesis at arbitrary camera positions along the X-axis, or Deep 3D Pan, with "t-shaped" adaptive kernels equipped with globally and locally adaptive dilations. Our proposed network architecture, the monster-net, is devised with a novel "t-shaped" adaptive kernel with globally and locally adaptive dilation, which can efficiently incorporate global camera shift into and handle local 3D geometries of the target image's pixels for the synthesis of naturally looking 3D panned…
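The idea of a "t-shaped" kernel whose dilation has a global part (the camera shift) and a local part (per-pixel geometry) can be illustrated with a small sketch. Everything specific here is an assumption for illustration, not the monster-net implementation: the footprint (a horizontal bar crossed by a shorter vertical stem), the arm lengths, the rule that the dilation is the sum of the global and local terms rounded to an integer, the border clamping, and the names `t_shaped_offsets` and `apply_t_kernel`. In the paper the per-pixel weights and dilations are adaptive; here they are simply passed in.

```python
def t_shaped_offsets(left, right, up, down, dilation):
    """Return (dy, dx) tap offsets for a t-shaped footprint scaled by dilation."""
    # Horizontal bar, including the centre tap.
    offsets = [(0, k * dilation) for k in range(-left, right + 1)]
    # Vertical stem, excluding the centre tap (already counted above).
    offsets += [(k * dilation, 0) for k in range(-up, down + 1) if k != 0]
    return offsets

def apply_t_kernel(image, weights, global_dilation, local_dilation,
                   arms=(2, 2, 1, 1)):
    """Filter a 2D grayscale image (list of rows) with per-pixel t-shaped kernels.

    weights[y][x] holds one weight per tap, in t_shaped_offsets order.
    Assumption: the dilation at a pixel is global_dilation + local_dilation[y][x],
    rounded to an integer and kept at 1 or more.
    """
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            d = max(1, round(global_dilation + local_dilation[y][x]))
            acc = 0.0
            for wgt, (dy, dx) in zip(weights[y][x], t_shaped_offsets(*arms, d)):
                yy = min(max(y + dy, 0), h - 1)  # clamp to the image border
                xx = min(max(x + dx, 0), w - 1)
                acc += wgt * image[yy][xx]
            out[y][x] = acc
    return out

# Toy example: a horizontal ramp, with all weight on the tap one step to the right.
image = [[0, 1, 2, 3, 4] for _ in range(3)]
one_hot = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  # 5 horizontal taps + 2 vertical taps
weights = [[one_hot for _ in range(5)] for _ in range(3)]
no_local = [[0] * 5 for _ in range(3)]

# A larger global dilation pulls each output pixel from farther along the X-axis.
print(apply_t_kernel(image, weights, 1, no_local)[0])
print(apply_t_kernel(image, weights, 2, no_local)[0])
```

In this toy setup the same weights produce a larger horizontal displacement when the global dilation grows, which loosely mirrors how a larger camera shift should move content farther along the X-axis, while a non-zero `local_dilation` entry changes the displacement for individual pixels.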
Taxonomy
Topics: Advanced Vision and Imaging · Optical measurement and interference techniques · Computer Graphics and Visualization Techniques
