FWD: Real-time Novel View Synthesis with Forward Warping and Depth
Ang Cao, Chris Rockwell, Justin Johnson

TL;DR
This paper introduces FWD, a real-time, generalizable novel view synthesis method that leverages explicit depth and differentiable rendering to produce high-quality images from sparse inputs with significant speed improvements.
Contribution
FWD is the first to combine explicit depth, differentiable rendering, and real-time performance for generalizable NVS with sparse views, achieving results competitive with prior methods, with better perceptual quality and a large speedup.
Findings
Achieves 130-1000x speedup over state-of-the-art methods.
Produces high-quality, photorealistic novel views from sparse inputs.
Seamlessly integrates sensor depth to enhance image quality.
Abstract
Novel view synthesis (NVS) is a challenging task requiring systems to generate photorealistic images of scenes from new viewpoints, where both quality and speed are important for applications. Previous image-based rendering (IBR) methods are fast, but have poor quality when input views are sparse. Recent Neural Radiance Fields (NeRF) and generalizable variants give impressive results but are not real-time. In our paper, we propose a generalizable NVS method with sparse inputs, called FWD, which gives high-quality synthesis in real-time. With explicit depth and differentiable rendering, it achieves competitive results to the SOTA methods with 130-1000x speedup and better perceptual quality. If available, we can seamlessly integrate sensor depth during either training or inference to improve image quality while retaining real-time speed. With the growing prevalence of depth sensors, we…
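To make "forward warping with explicit depth" concrete, here is a minimal geometric sketch in plain Python. It is not the authors' implementation: it warps a single pixel, assumes a pinhole camera shared by both views, and assumes the rigid transform (R, t) from source-camera to target-camera coordinates is known. The function name and parameters are hypothetical, and the differentiable renderer and any learned components mentioned in the abstract are omitted.

```python
def forward_warp_pixel(u, v, depth, fx, fy, cx, cy, R, t):
    """Warp one source pixel (u, v) with known depth into a target view.

    Illustrative assumptions (not from the paper): both views share the
    pinhole intrinsics (fx, fy, cx, cy); R is a 3x3 rotation (list of rows)
    and t a 3-vector mapping source-camera to target-camera coordinates.
    Returns (u_target, v_target, z_target), or None if the point falls
    behind the target camera.
    """
    # Unproject the pixel to a 3D point in the source camera frame.
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    z = depth

    # Move the 3D point into the target camera frame.
    xt = R[0][0] * x + R[0][1] * y + R[0][2] * z + t[0]
    yt = R[1][0] * x + R[1][1] * y + R[1][2] * z + t[1]
    zt = R[2][0] * x + R[2][1] * y + R[2][2] * z + t[2]

    # Points at or behind the target camera plane cannot be projected.
    if zt <= 0:
        return None

    # Project back to pixel coordinates in the target view.
    return (fx * xt / zt + cx, fy * yt / zt + cy, zt)


if __name__ == "__main__":
    identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    # A sideways camera shift moves the pixel horizontally by fx * shift / depth.
    print(forward_warp_pixel(10, 20, 2.0, 100.0, 100.0, 32.0, 24.0,
                             identity, [0.5, 0.0, 0.0]))
```

Because the warp depends only on per-pixel depth and the camera pose, it needs no per-scene optimization. A full forward-warping pipeline would also have to resolve several source pixels landing on the same target location and fill holes left by disocclusions; this sketch leaves both out.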
Taxonomy
Topics: Advanced Vision and Imaging · Computer Graphics and Visualization Techniques · Advanced Image Processing Techniques
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
