Peeking Behind Objects: Layered Depth Prediction from a Single Image
Helisa Dhamo, Keisuke Tateno, Iro Laina, Nassir Navab, Federico Tombari

TL;DR
This paper introduces a CNN-based method to predict layered depth images from a single RGB image, enabling occluded scene regions to be inferred for improved scene synthesis in AR/VR applications.
Contribution
It presents a novel approach combining depth and foreground mask prediction with GANs to hallucinate occluded scene parts from a single image.
Findings
Effective scene view synthesis from a single image
Improved occlusion handling in depth prediction
Enhanced AR/VR scene exploration capabilities
Abstract
While conventional depth estimation can infer the geometry of a scene from a single RGB image, it fails to estimate scene regions that are occluded by foreground objects. This limits the use of depth prediction in augmented and virtual reality applications, that aim at scene exploration by synthesizing the scene from a different vantage point, or at diminished reality. To address this issue, we shift the focus from conventional depth map prediction to the regression of a specific data representation called Layered Depth Image (LDI), which contains information about the occluded regions in the reference frame and can fill in occlusion gaps in case of small view changes. We propose a novel approach based on Convolutional Neural Networks (CNNs) to jointly predict depth maps and foreground separation masks used to condition Generative Adversarial Networks (GANs) for hallucinating plausible…
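To make the abstract's claim about small view changes concrete, here is a minimal toy sketch, not the authors' implementation. It assumes a two-layer LDI stored as NumPy arrays, a purely horizontal camera shift, and a disparity proportional to shift / depth. The names `warp_layer`, `render_ldi`, and `shift` are hypothetical and chosen only for illustration.

```python
import numpy as np

def warp_layer(color, depth, valid, shift, out_color, out_depth):
    """Forward-warp one layer horizontally, keeping the nearest surface per pixel."""
    h, w = depth.shape
    for y in range(h):
        for x in range(w):
            if not valid[y, x]:
                continue
            # Assumption: toy pinhole model, disparity inversely proportional to depth.
            x_new = x + int(round(shift / depth[y, x]))
            if 0 <= x_new < w and depth[y, x] < out_depth[y, x_new]:
                out_depth[y, x_new] = depth[y, x]
                out_color[y, x_new] = color[y, x]

def render_ldi(layers, shift):
    """Render a list of (color, depth, valid) layers from a shifted viewpoint.

    Returns the warped image and a boolean mask of pixels no layer reached.
    """
    h, w = layers[0][1].shape
    out_color = np.zeros((h, w), dtype=float)
    out_depth = np.full((h, w), np.inf)
    for color, depth, valid in layers:
        warp_layer(color, depth, valid, shift, out_color, out_depth)
    holes = np.isinf(out_depth)
    return out_color, holes

# Layer 0: the visible surface, with a near object (depth 1) at columns 2-3
# in front of a far wall (depth 4).
depth0 = np.array([[4.0, 4.0, 1.0, 1.0, 4.0, 4.0, 4.0, 4.0]])
color0 = np.where(depth0 < 2.0, 1.0, 0.5)
valid0 = np.ones_like(depth0, dtype=bool)

# Layer 1: the wall hidden behind the object (only defined where it is occluded).
depth1 = np.full_like(depth0, 4.0)
color1 = np.full_like(depth0, 0.5)
valid1 = depth0 < 2.0

_, holes_single = render_ldi([(color0, depth0, valid0)], shift=4)
_, holes_ldi = render_ldi([(color0, depth0, valid0), (color1, depth1, valid1)], shift=4)
print(int(holes_single.sum()), int(holes_ldi.sum()))
```

In this toy scene, warping the single depth map leaves gaps where the near object used to hide the wall, while the second layer fills them. The one remaining hole lies at the image border, which neither layer covers.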
