View Synthesis of Dynamic Scenes based on Deep 3D Mask Volume
Kai-En Lin, Guowei Yang, Lei Xiao, Feng Liu, Ravi Ramamoorthi

TL;DR
This paper introduces a new dataset and a deep learning method for synthesizing stable, photorealistic views of dynamic scenes from multi-view video, addressing challenges of temporal consistency and scene complexity.
Contribution
The paper presents a novel Deep 3D Mask Volume algorithm and a high-quality multi-view video dataset for dynamic scene view synthesis.
Findings
Achieves improved temporal stability over existing methods.
Produces view synthesis videos with minimal flickering artifacts.
Enables larger translational movements in synthesized views.
Abstract
Image view synthesis has seen great success in reconstructing photorealistic visuals, thanks to deep learning and various novel representations. The next key step in immersive virtual experiences is view synthesis of dynamic scenes. However, several challenges exist due to the lack of high-quality training datasets, and the additional time dimension for videos of dynamic scenes. To address this issue, we introduce a multi-view video dataset, captured with a custom 10-camera rig in 120FPS. The dataset contains 96 high-quality scenes showing various visual effects and human interactions in outdoor scenes. We develop a new algorithm, Deep 3D Mask Volume, which enables temporally-stable view extrapolation from binocular videos of dynamic scenes, captured by static cameras. Our algorithm addresses the temporal inconsistency of disocclusions by identifying the error-prone areas with a 3D mask…
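The abstract is cut off before it explains how the 3D mask is applied, so the following is only an illustrative sketch, not the authors' method. It assumes one plausible reading: a per-layer mask in [0, 1] selects between a per-frame layered volume and a temporally stable background volume along each camera ray, and the blended layers are then composited back to front. The names blend_ray and composite_over, and the layered RGBA representation itself, are hypothetical and do not come from the source.

```python
from typing import List, Tuple

RGBA = Tuple[float, float, float, float]


def blend_ray(frame: List[RGBA], background: List[RGBA],
              mask: List[float]) -> List[RGBA]:
    """Blend two layered RGBA samples along one camera ray (hypothetical helper).

    mask[i] = 1 keeps the per-frame sample at layer i; mask[i] = 0 falls back
    to the background sample. This blending rule is an assumption.
    """
    if not (len(frame) == len(background) == len(mask)):
        raise ValueError("frame, background and mask must have the same depth")
    return [
        tuple(m * f + (1.0 - m) * b for f, b in zip(f_layer, b_layer))
        for f_layer, b_layer, m in zip(frame, background, mask)
    ]


def composite_over(layers: List[RGBA]) -> Tuple[float, float, float]:
    """Standard back-to-front 'over' compositing; layers[0] is the farthest."""
    r = g = b = 0.0
    for lr, lg, lb, a in layers:
        r = lr * a + r * (1.0 - a)
        g = lg * a + g * (1.0 - a)
        b = lb * a + b * (1.0 - a)
    return (r, g, b)


# Two depth layers along one ray: far layer first, near layer second.
frame_ray = [(1.0, 0.0, 0.0, 1.0), (0.0, 1.0, 0.0, 1.0)]
background_ray = [(0.0, 0.0, 1.0, 1.0), (0.0, 0.0, 1.0, 0.0)]

# Mask trusts the per-frame volume only at the near layer.
pixel_dynamic = composite_over(blend_ray(frame_ray, background_ray, [0.0, 1.0]))
# Mask of all zeros falls back entirely to the background volume.
pixel_static = composite_over(blend_ray(frame_ray, background_ray, [0.0, 0.0]))
```

Under this assumption, regions where the mask is zero reuse the same background samples in every frame, which is one way a mask could suppress frame-to-frame flicker in disoccluded areas.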
Taxonomy
Topics: Advanced Vision and Imaging · Computer Graphics and Visualization Techniques · Image and Video Stabilization
