A Lightweight Neural Network for Monocular View Generation with Occlusion Handling
Simon Evain, Christine Guillemot

TL;DR
This paper introduces a lightweight neural network that synthesizes novel views from a single image using disparity estimation and occlusion handling, outperforming state-of-the-art methods on KITTI with significantly fewer parameters.
Contribution
A novel, efficient neural network architecture for monocular view synthesis that incorporates occlusion handling and disparity consistency, reducing model size while maintaining high performance.
Findings
Outperforms state-of-the-art methods on the KITTI dataset.
Uses 5 to 10 times fewer parameters than those methods.
Generates accurate depth and confidence maps.
Abstract
In this article, we present a very lightweight neural network architecture, trained on stereo data pairs, which performs view synthesis from one single image. With the growing success of multi-view formats, this problem is indeed increasingly relevant. The network returns a prediction built from disparity estimation, which fills in wrongly predicted regions using an occlusion handling technique. To do so, during training, the network learns to estimate the left-right consistency structural constraint on the pair of stereo input images, to be able to replicate it at test time from one single image. The method is built upon the idea of blending two predictions: a prediction based on disparity estimation, and a prediction based on direct minimization in occluded regions. The network is also able to identify these occluded areas at training and at test time by checking the pixelwise…
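To make the blending idea in the abstract concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes the two disparity maps are already available (as they would be for a stereo pair at training time), a common sign convention in which a left pixel at column x matches the right pixel at column x - d, nearest-neighbor sampling, and a hypothetical threshold of one pixel. The function names `warp_horizontal`, `lr_consistency_mask`, and `blend` are illustrative only.

```python
import numpy as np

def warp_horizontal(image, disparity):
    """Sample `image` at column x - disparity(x) in each row (nearest neighbor).

    `image` may be (H, W) or (H, W, C); `disparity` is (H, W) in pixels.
    Out-of-range source columns are clamped to the image border.
    """
    h, w = disparity.shape
    src_cols = np.arange(w)[None, :] - disparity
    src_cols = np.clip(np.rint(src_cols).astype(int), 0, w - 1)
    rows = np.arange(h)[:, None]
    return image[rows, src_cols]

def lr_consistency_mask(disp_left, disp_right, threshold=1.0):
    """Return True where the left and right disparity maps agree.

    Pixels that fail the check are treated here as occluded or unreliable.
    The threshold value is an assumption, not taken from the paper.
    """
    disp_right_seen_from_left = warp_horizontal(disp_right, disp_left)
    return np.abs(disp_left - disp_right_seen_from_left) <= threshold

def blend(pred_disparity, pred_direct, confidence):
    """Per-pixel convex combination of the two candidate predictions.

    `confidence` is (H, W) with values in [0, 1]; 1 keeps the
    disparity-based prediction, 0 keeps the direct prediction.
    """
    c = confidence[..., None] if pred_disparity.ndim == 3 else confidence
    return c * pred_disparity + (1.0 - c) * pred_direct
```

In this sketch the consistency mask, cast to float, can serve directly as the confidence map passed to `blend`. A learned network would instead have to predict such a map from a single image at test time, as the abstract describes.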
