Geometry-biased Transformers for Novel View Synthesis

Naveen Venkat; Mayank Agarwal; Maneesh Singh; Shubham Tulsiani

arXiv:2301.04650·cs.CV·January 12, 2023

TL;DR

This paper introduces Geometry-biased Transformers (GBTs), which add a learnable bias based on 3D distances between rays to dot-product attention, so that novel views synthesized from a few input images and their camera viewpoints are more geometrically consistent.

Contribution

The paper proposes a Transformer architecture with geometric inductive biases that encourage multi-view geometric consistency when synthesizing novel views of an object from a few input images and associated camera viewpoints.

Findings

01  Significant improvement over prior methods on the CO3D dataset

02  Enhanced geometric consistency in generated views

03  Effective use of a 3D distance-based attention bias

Abstract

We tackle the task of synthesizing novel views of an object given a few input images and associated camera viewpoints. Our work is inspired by recent 'geometry-free' approaches where multi-view images are encoded as a (global) set-latent representation, which is then used to predict the color for arbitrary query rays. While this representation yields (coarsely) accurate images corresponding to novel viewpoints, the lack of geometric reasoning limits the quality of these outputs. To overcome this limitation, we propose 'Geometry-biased Transformers' (GBTs) that incorporate geometric inductive biases in the set-latent representation-based inference to encourage multi-view geometric consistency. We induce the geometric bias by augmenting the dot-product attention mechanism to also incorporate 3D distances between rays associated with tokens as a learnable bias. We find that this, along…
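
The attention bias described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes each token is associated with one ray, treats rays as infinite 3D lines, and subtracts `gamma**2 * distance` from the attention logits, where `gamma` stands in for a learnable scalar. The function names, the exact form of the bias, and the choice of line-to-line distance are all assumptions made for illustration.

```python
import numpy as np

def ray_distances(origins, dirs, eps=1e-8):
    """Pairwise shortest distance between rays treated as infinite lines.

    origins, dirs: arrays of shape (N, 3). Returns an (N, N) matrix.
    (Hypothetical helper; the distance used in the paper may differ.)
    """
    dirs = dirs / np.linalg.norm(dirs, axis=-1, keepdims=True)
    diff = origins[None, :, :] - origins[:, None, :]        # o_j - o_i
    cross = np.cross(dirs[:, None, :], dirs[None, :, :])    # d_i x d_j
    cross_norm = np.linalg.norm(cross, axis=-1)
    # Skew (non-parallel) lines: project the offset onto the common normal.
    skew = np.abs(np.sum(diff * cross, axis=-1)) / np.maximum(cross_norm, eps)
    # Parallel lines: distance from o_j to the line through o_i along d_i.
    parallel = np.linalg.norm(np.cross(diff, dirs[:, None, :]), axis=-1)
    return np.where(cross_norm > eps, skew, parallel)

def geometry_biased_attention(q, k, v, dist, gamma):
    """Scaled dot-product attention with a distance penalty on the logits.

    q, k: (N, d); v: (N, d_v); dist: (N, N); gamma: scalar (assumed learnable).
    Returns the attended values and the attention weights.
    """
    logits = q @ k.T / np.sqrt(q.shape[-1]) - (gamma ** 2) * dist
    logits -= logits.max(axis=-1, keepdims=True)            # numerical stability
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights
```

With this form, tokens whose rays pass close to each other in 3D keep their content-based attention score, while tokens whose rays are far apart are down-weighted. Setting `gamma` to zero recovers ordinary dot-product attention.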

Taxonomy

Topics: Advanced Vision and Imaging · Computer Graphics and Visualization Techniques · Generative Adversarial Networks and Image Synthesis