TL;DR
LagerNVS is an encoder-decoder neural network for real-time, high-quality novel view synthesis (NVS) that builds on pre-trained 3D-aware features and can be paired with a diffusion decoder for generative extrapolation.
Contribution
The paper proposes LagerNVS, an encoder-decoder architecture that brings 3D inductive biases into NVS by initializing its encoder from a 3D reconstruction network pre-trained with explicit 3D supervision.
Findings
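Achieves 31.4 PSNR on the Re10k dataset, state of the art among deterministic feed-forward NVS methods.
Works with and without known camera parameters, and renders in real time.
Generalizes to in-the-wild data and supports generative extrapolation when paired with a diffusion decoder.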
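For readers unfamiliar with the headline metric, the following is a minimal sketch of how PSNR is conventionally computed between a rendered view and its ground-truth image. It is not taken from the paper: the function name `psnr`, the flattened-list image representation, and the assumption that pixel values lie in [0, 1] are all illustrative choices.

```python
import math
from typing import Sequence


def psnr(rendered: Sequence[float], target: Sequence[float], max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences.

    Assumes pixel intensities are scaled to [0, max_value]; this scaling is an
    assumption of the sketch, not something stated in the source.
    """
    if len(rendered) != len(target) or len(rendered) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - t) ** 2 for r, t in zip(rendered, target)) / len(rendered)
    if mse == 0.0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_for_psnr(psnr_db: float, max_value: float = 1.0) -> float:
    """Invert the PSNR formula: the MSE that corresponds to a given PSNR."""
    return max_value ** 2 / (10.0 ** (psnr_db / 10.0))
```

Under these assumptions, a single image scoring 31.4 dB has a root-mean-square pixel error of roughly 0.027 on a [0, 1] scale (`math.sqrt(mse_for_psnr(31.4))`). Benchmark figures are usually averaged over many test views, so this is only a rough intuition for what the number means.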
Abstract
Recent work has shown that neural networks can perform 3D tasks such as Novel View Synthesis (NVS) without explicit 3D reconstruction. Even so, we argue that strong 3D inductive biases are still helpful in the design of such networks. We show this point by introducing LagerNVS, an encoder-decoder neural network for NVS that builds on '3D-aware' latent features. The encoder is initialized from a 3D reconstruction network pre-trained using explicit 3D supervision. This is paired with a lightweight decoder, and trained end-to-end with photometric losses. LagerNVS achieves state-of-the-art deterministic feed-forward Novel View Synthesis (including 31.4 PSNR on Re10k), with and without known cameras, renders in real time, generalizes to in-the-wild data, and can be paired with a diffusion decoder for generative extrapolation.
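The data flow described in the abstract (source views into an encoder, latent features plus a target camera into a decoder, and a photometric loss on the rendered image) can be sketched as below. This is a toy illustration only. The class `EncoderDecoderNVS`, the functions `toy_encoder` and `toy_decoder`, and the use of mean squared error as the photometric loss are all hypothetical stand-ins; the real model uses a pre-trained 3D reconstruction network as its encoder and a learned lightweight decoder, neither of which is reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Image = List[float]       # flattened pixel intensities in [0, 1] (assumption)
Latent = List[float]      # stand-in for the "3D-aware" latent features
Camera = Sequence[float]  # stand-in for target camera parameters


def photometric_loss(rendered: Image, target: Image) -> float:
    """Mean squared pixel error; one common photometric loss (assumed here)."""
    return sum((r - t) ** 2 for r, t in zip(rendered, target)) / len(target)


@dataclass
class EncoderDecoderNVS:
    """Hypothetical wrapper showing only the encoder -> decoder data flow."""
    encoder: Callable[[Sequence[Image]], Latent]
    decoder: Callable[[Latent, Camera], Image]

    def render(self, source_views: Sequence[Image], target_camera: Camera) -> Image:
        latent = self.encoder(source_views)         # source views -> latent features
        return self.decoder(latent, target_camera)  # latents + camera -> novel view


def toy_encoder(views: Sequence[Image]) -> Latent:
    # Toy stand-in: per-pixel average over the source views.
    return [sum(pixels) / len(views) for pixels in zip(*views)]


def toy_decoder(latent: Latent, camera: Camera) -> Image:
    # Toy stand-in: shift brightness by the first camera value, clamp to [0, 1].
    return [min(1.0, max(0.0, value + camera[0])) for value in latent]


model = EncoderDecoderNVS(encoder=toy_encoder, decoder=toy_decoder)
source_views = [[0.2, 0.4], [0.4, 0.6]]
rendered = model.render(source_views, target_camera=[0.1])
loss = photometric_loss(rendered, target=[0.4, 0.6])
```

In end-to-end training, a loss of this kind would be backpropagated through both the decoder and the pre-initialized encoder; the toy functions above have no trainable parameters and only show how the pieces connect.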
