End-to-end View Synthesis via NeRF Attention
Zelin Zhao, Jiaya Jia

TL;DR
This paper introduces NeRF Attention (NeRFA), a novel transformer-based approach for view synthesis that incorporates volumetric rendering principles and multi-stage attention to improve efficiency and quality over previous methods.
Contribution
NeRFA integrates NeRF-inspired feature modulation into transformers and employs multi-stage attention, achieving state-of-the-art results in view synthesis tasks.
Findings
NeRFA outperforms NeRF and NerFormer on four datasets.
NeRFA achieves a new state of the art in single-scene and category-centric view synthesis.
NeRFA effectively models volumetric rendering within a transformer framework.
Abstract
In this paper, we present a simple seq2seq formulation for view synthesis where we take a set of ray points as input and output colors corresponding to the rays. Directly applying a standard transformer on this seq2seq formulation has two limitations. First, the standard attention cannot successfully fit the volumetric rendering procedure, and therefore high-frequency components are missing in the synthesized views. Second, applying global attention to all rays and pixels is extremely inefficient. Inspired by the neural radiance field (NeRF), we propose the NeRF attention (NeRFA) to address the above problems. On the one hand, NeRFA considers the volumetric rendering equation as a soft feature modulation procedure. In this way, the feature modulation enhances the transformers with the NeRF-like inductive bias. On the other hand, NeRFA performs multi-stage attention to reduce the…
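For readers unfamiliar with the volumetric rendering equation the abstract refers to, the sketch below shows the standard discrete NeRF compositing step: per-sample densities along a ray are turned into weights, and the weights blend per-point features (or colors) into one output per ray. This is the classic NeRF quadrature, not the NeRFA feature-modulation layer itself, whose details are not given in the text above. The function names `volume_render_weights` and `render_rays` and the array shapes are assumptions chosen for illustration.

```python
import numpy as np

def volume_render_weights(sigma, deltas):
    """Standard NeRF quadrature weights (illustrative, not the NeRFA layer).

    sigma:  (num_rays, num_samples) non-negative densities
    deltas: (num_rays, num_samples) distances between adjacent samples
    """
    # Per-sample opacity: alpha_i = 1 - exp(-sigma_i * delta_i)
    alpha = 1.0 - np.exp(-sigma * deltas)
    # Transmittance T_i = prod_{j < i} (1 - alpha_j), an exclusive cumulative product
    trans = np.cumprod(1.0 - alpha, axis=-1)
    trans = np.concatenate([np.ones_like(trans[..., :1]), trans[..., :-1]], axis=-1)
    # Weight of sample i is the light that reaches it times its opacity
    return trans * alpha

def render_rays(features, sigma, deltas):
    """Blend per-point features of shape (num_rays, num_samples, dim) into (num_rays, dim)."""
    weights = volume_render_weights(sigma, deltas)
    return (weights[..., None] * features).sum(axis=-2)

# One ray with three samples: empty space, an opaque surface, then a hidden point.
sigma = np.array([[0.0, 1e9, 5.0]])
deltas = np.ones((1, 3))
colors = np.array([[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]])
pixel = render_rays(colors, sigma, deltas)
```

In this toy ray the opaque second sample receives all of the weight, so the rendered pixel takes that sample's color and the point behind it contributes nothing. Because the weights are a soft, differentiable function of the densities, the same computation can be read as a soft gating of per-point features, which is the intuition behind treating rendering as feature modulation.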
Taxonomy
Topics: Advanced Vision and Imaging · Computer Graphics and Visualization Techniques · 3D Shape Modeling and Analysis
Methods: Sigmoid Activation · Tanh Activation · RoIAlign · RoIPool · Softmax · Long Short-Term Memory · Sequence to Sequence
