Multi-view Image Diffusion via Coordinate Noise and Fourier Attention
Justin Theiss, Norman Müller, Daeil Kim, Aayush Prakash

TL;DR
This paper introduces a diffusion-based method that combines Fourier-based attention, shared noise initialization, and a cross-attention loss to generate multi-view consistent images from prompts, reporting state-of-the-art results on multi-view consistency metrics.
Contribution
The paper presents a new multi-view image diffusion approach that uses Fourier-based attention, a novel noise initialization technique, and a cross-attention loss to improve multi-view consistency.
Findings
Achieves state-of-the-art performance on multi-view consistency metrics.
Produces qualitatively better multi-view images compared to existing methods.
Demonstrates improved alignment of features across different views.
Abstract
Recently, text-to-image generation with diffusion models has made significant advancements in both fidelity and generalization capabilities compared to previous baselines. However, generating holistic multi-view consistent images from prompts remains an important and challenging task. To address this challenge, we propose a diffusion process that attends to time-dependent spatial frequencies of features with a novel attention mechanism, as well as a novel noise initialization technique and a cross-attention loss. This Fourier-based attention block focuses on features from non-overlapping regions of the generated scene in order to better align the global appearance. Our noise initialization technique incorporates shared noise and low spatial frequency information derived from pixel coordinates and depth maps to induce noise correlations across views. The cross-attention loss…
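The shared-noise idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a simple variance-preserving blend of one shared Gaussian draw with an independent Gaussian draw per view, and it omits the low-frequency component derived from pixel coordinates and depth maps. The names correlated_view_noise, pearson, and shared_weight are hypothetical and chosen only for this example.

```python
import math
import random


def correlated_view_noise(num_views, num_pixels, shared_weight, seed=0):
    """Return one flat noise vector per view.

    ASSUMPTION (not from the paper): each view's noise is
        sqrt(w) * shared + sqrt(1 - w) * private
    with w = shared_weight in [0, 1]. Both terms are unit-variance
    Gaussians, so the blend keeps unit variance and any two views
    have correlation w.
    """
    if not 0.0 <= shared_weight <= 1.0:
        raise ValueError("shared_weight must lie in [0, 1]")
    rng = random.Random(seed)
    # One Gaussian draw per pixel that every view reuses.
    shared = [rng.gauss(0.0, 1.0) for _ in range(num_pixels)]
    a = math.sqrt(shared_weight)
    b = math.sqrt(1.0 - shared_weight)
    views = []
    for _ in range(num_views):
        # Independent Gaussian draw that only this view sees.
        private = [rng.gauss(0.0, 1.0) for _ in range(num_pixels)]
        views.append([a * s + b * p for s, p in zip(shared, private)])
    return views


def pearson(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((u - mean_x) * (v - mean_y) for u, v in zip(x, y))
    var_x = sum((u - mean_x) ** 2 for u in x)
    var_y = sum((v - mean_y) ** 2 for v in y)
    return cov / math.sqrt(var_x * var_y)
```

Calling correlated_view_noise(2, 20000, 0.5) and passing the two resulting vectors to pearson should give a value close to 0.5, since the expected correlation between views equals shared_weight. Setting shared_weight to 1.0 makes all views start from identical noise, and 0.0 recovers fully independent noise per view.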
Taxonomy
Topics: Image and Signal Denoising Methods · Advanced Image Processing Techniques · Advanced Image Fusion Techniques
Methods: Softmax · Attention Is All You Need · Diffusion · ALIGN
