Enhancing NeRF akin to Enhancing LLMs: Generalizable NeRF Transformer with Mixture-of-View-Experts

Wenyan Cong; Hanxue Liang; Peihao Wang; Zhiwen Fan; Tianlong Chen; Mukund Varma; Yi Wang; Zhangyang Wang

arXiv:2308.11793 · cs.CV · August 24, 2023

TL;DR

This paper introduces GNT-MOVE, a generalizable NeRF model that adds Mixture-of-Experts (MoE) layers to a transformer-based architecture. Combining the transformer backbone with expert specialization yields state-of-the-art cross-scene view synthesis.

Contribution

The paper integrates Mixture-of-Experts (MoE) into GNT, an existing generalizable NeRF transformer architecture, and adds a shared expert and a geometry-aware loss to improve generalization to unseen scenes.
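
As a rough illustration of the idea, the sketch below shows a generic sparse MoE layer with top-k routing plus one shared expert that processes every input. It is not the authors' implementation: the class name `ToyMoELayer`, the linear experts, the expert count, and `top_k` are all hypothetical choices, and treating the "shared expert" as always active is an assumption made here for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


class ToyMoELayer:
    """Hypothetical toy layer: top-k routed experts plus one shared expert."""

    def __init__(self, dim, num_experts, top_k, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        # Router: one score per expert for each input vector.
        self.router = random_matrix(num_experts, dim, rng)
        # Each expert is a single linear map here (an assumption for brevity).
        self.experts = [random_matrix(dim, dim, rng) for _ in range(num_experts)]
        # Shared expert: applied to every input, bypassing the router.
        self.shared_expert = random_matrix(dim, dim, rng)

    def route(self, x):
        """Return indices of the top-k experts and their normalized gate weights."""
        logits = matvec(self.router, x)
        ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
        top = ranked[: self.top_k]
        weights = softmax([logits[i] for i in top])
        return top, weights

    def forward(self, x):
        top, weights = self.route(x)
        # Start from the shared expert's output, then add the routed experts.
        out = matvec(self.shared_expert, x)
        for idx, w in zip(top, weights):
            y = matvec(self.experts[idx], x)
            out = [o + w * yi for o, yi in zip(out, y)]
        return out


layer = ToyMoELayer(dim=4, num_experts=4, top_k=2)
token = [0.5, -1.0, 0.25, 2.0]
selected, gate_weights = layer.route(token)
output = layer.forward(token)
print("selected experts:", selected)
print("gate weights:", gate_weights)
```

Only `top_k` of the experts run per input, so total capacity grows with the number of experts while per-input compute stays roughly fixed. The shared expert gives every input a common path regardless of routing.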

Findings

01. Achieves state-of-the-art results on unseen scenes

02. Demonstrates superior zero-shot and few-shot generalization

03. Outperforms previous generalizable NeRF models

Abstract

Cross-scene generalizable NeRF models, which can directly synthesize novel views of unseen scenes, have become a new spotlight of the NeRF field. Several existing attempts rely on increasingly end-to-end "neuralized" architectures, i.e., replacing scene representation and/or rendering modules with performant neural networks such as transformers, and turning novel view synthesis into a feed-forward inference pipeline. While those feed-forward "neuralized" architectures still do not fit diverse scenes well out of the box, we propose to bridge them with the powerful Mixture-of-Experts (MoE) idea from large language models (LLMs), which has demonstrated superior generalization ability by balancing between larger overall model capacity and flexible per-instance specialization. Starting from a recent generalizable NeRF architecture called GNT, we first demonstrate that MoE can be neatly…

Code & Models

Repositories

vita-group/gnt-move (PyTorch, official)

Taxonomy

Topics: Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques · Generative Adversarial Networks and Image Synthesis