# Monocular Neural Image Based Rendering with Continuous View Control

**Authors:** Xu Chen, Jie Song, Otmar Hilliges

arXiv: 1901.01880 · 2019-09-10

## TL;DR

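This paper introduces a self-supervised neural network architecture that synthesizes high-quality novel views of 3D objects or scenes from a single monocular image, with fine-grained control over the 6-DOF viewpoint. It relies on geometric constraints and depth-guided warping rather than depth or flow supervision.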

## Contribution

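The paper proposes a network architecture that combines a transforming auto-encoder with depth-guided warping. Geometric constraints replace direct supervision via depth or flow maps, so training requires only 2D images and their associated view transforms, and the trained model generalizes to entirely unseen images.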

## Key findings

- Achieves high-quality view synthesis with fine-grained control over the 6-DOF viewpoint.
- Requires no depth or flow supervision; 2D images and view transforms suffice for training.
- Generalizes to entirely unseen images, such as product photos downloaded from the internet.

## Abstract

We present an approach that learns to synthesize high-quality, novel views of 3D objects or scenes, while providing fine-grained and precise control over the 6-DOF viewpoint. The approach is self-supervised and only requires 2D images and associated view transforms for training. Our main contribution is a network architecture that leverages a transforming auto-encoder in combination with a depth-guided warping procedure to predict geometrically accurate unseen views. Leveraging geometric constraints renders direct supervision via depth or flow maps unnecessary. If large parts of the object are occluded in the source view, a purely learning based prior is used to predict the values for dis-occluded pixels. Our network furthermore predicts a per-pixel mask, used to fuse depth-guided and pixel-based predictions. The resulting images reflect the desired 6-DOF transformation and details are preserved. We thoroughly evaluate our architecture on synthetic and real scenes and under fine-grained and fixed-view settings. Finally, we demonstrate that the approach generalizes to entirely unseen images such as product images downloaded from the internet.
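The depth-guided warping and mask-based fusion described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a pinhole camera with known intrinsics `K`, a depth map given in the target view, and a rigid transform `(R, t)` taking target-camera coordinates to source-camera coordinates. The function names `project_to_source`, `bilinear_sample`, and `fuse` are hypothetical, and in the paper the depth, pixel prediction, and mask are network outputs rather than given arrays.

```python
import numpy as np

def project_to_source(depth_t, K, R, t):
    """For every target pixel, compute its (x, y) location in the source image.

    Assumptions (not from the paper): depth_t is an (H, W) target-view depth
    map, K is a 3x3 pinhole intrinsic matrix, and (R, t) maps target-camera
    coordinates to source-camera coordinates.
    """
    H, W = depth_t.shape
    ys, xs = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    # Homogeneous pixel coordinates, shape (3, H*W).
    pix = np.stack([xs, ys, np.ones_like(xs)], axis=0).reshape(3, -1).astype(np.float64)
    # Back-project pixels to 3D points in the target camera frame.
    pts_t = (np.linalg.inv(K) @ pix) * depth_t.reshape(1, -1)
    # Move the points into the source camera frame and project them.
    pts_s = R @ pts_t + t.reshape(3, 1)
    proj = K @ pts_s
    uv = proj[:2] / proj[2:3]
    return uv.reshape(2, H, W)

def bilinear_sample(img, uv):
    """Sample an (H, W, C) image at real-valued coordinates uv of shape (2, H, W).

    Coordinates outside the image are clamped to the border.
    """
    H, W = img.shape[:2]
    x = np.clip(uv[0], 0, W - 1)
    y = np.clip(uv[1], 0, H - 1)
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, W - 1)
    y1 = np.minimum(y0 + 1, H - 1)
    wx = (x - x0)[..., None]
    wy = (y - y0)[..., None]
    top = img[y0, x0] * (1 - wx) + img[y0, x1] * wx
    bottom = img[y1, x0] * (1 - wx) + img[y1, x1] * wx
    return top * (1 - wy) + bottom * wy

def fuse(warped, pixel_pred, mask):
    """Blend the depth-guided result with the pixel-based prediction.

    mask is (H, W) in [0, 1]; 1 keeps the warped pixel, 0 keeps the
    directly predicted pixel (e.g. in dis-occluded regions).
    """
    m = mask[..., None]
    return m * warped + (1 - m) * pixel_pred
```

A typical call chain under these assumptions is `uv = project_to_source(depth, K, R, t)`, then `warped = bilinear_sample(source_image, uv)`, then `fuse(warped, pixel_pred, mask)`. Because every step is differentiable with respect to depth, a reconstruction loss on the fused image can train the depth predictor without ground-truth depth, which is the sense in which geometric constraints stand in for direct supervision.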

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.01880/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1901.01880/full.md

## References

65 references — full list in the complete paper: https://tomesphere.com/paper/1901.01880/full.md

---
Source: https://tomesphere.com/paper/1901.01880