Taming Stable Diffusion for Text to 360° Panorama Image Generation
Cheng Zhang, Qianyi Wu, Camilo Cruz Gambardella, Xiaoshui Huang, Dinh Phung, Wanli Ouyang, Jianfei Cai

TL;DR
This paper introduces PanFusion, a dual-branch diffusion model that generates 360-degree panorama images from text prompts, addressing data scarcity and domain gap issues in panorama image synthesis.
Contribution
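The paper presents a dual-branch diffusion model for text-to-panorama generation: a Stable Diffusion branch supplies prior knowledge of natural images, a panorama branch handles holistic generation, and a projection-aware cross-attention mechanism links the two to minimize distortion. The dual-branch structure also allows additional constraints to be integrated.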
Findings
Outperforms existing methods in panorama generation quality
Effectively incorporates additional constraints like room layout
Demonstrates robustness across diverse text prompts
Abstract
Generative models, e.g., Stable Diffusion, have enabled the creation of photorealistic images from text prompts. Yet, the generation of 360-degree panorama images from text remains a challenge, particularly due to the dearth of paired text-panorama data and the domain gap between panorama and perspective images. In this paper, we introduce a novel dual-branch diffusion model named PanFusion to generate a 360-degree image from a text prompt. We leverage the stable diffusion model as one branch to provide prior knowledge in natural image generation and register it to another panorama branch for holistic image generation. We propose a unique cross-attention mechanism with projection awareness to minimize distortion during the collaborative denoising process. Our experiments validate that PanFusion surpasses existing methods and, thanks to its dual-branch structure, can integrate additional…
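The domain gap between perspective and panorama images mentioned in the abstract comes down to projection geometry. The sketch below is not PanFusion's code; it is a minimal illustration, assuming the panorama is stored in equirectangular format (longitude along the width, latitude along the height) and the perspective view is an ideal pinhole camera. The function name, parameter names, and axis conventions are hypothetical choices made for this example.

```python
import math


def perspective_pixel_to_equirect(u, v, view_w, view_h, fov_deg,
                                  yaw_deg, pitch_deg, pano_w, pano_h):
    """Map pixel (u, v) of a pinhole view to (x, y) on an equirectangular panorama.

    Assumed conventions (illustrative only): camera x points right, y points
    down, z points forward; positive yaw turns right, positive pitch looks up.
    """
    # Focal length in pixels, derived from the horizontal field of view.
    f = (view_w / 2.0) / math.tan(math.radians(fov_deg) / 2.0)

    # Viewing ray through the pixel, in camera coordinates.
    dx, dy, dz = u - view_w / 2.0, v - view_h / 2.0, f

    # Rotate by pitch about the x axis (positive pitch tilts the ray upward).
    p = math.radians(pitch_deg)
    dy, dz = (dy * math.cos(p) - dz * math.sin(p),
              dy * math.sin(p) + dz * math.cos(p))

    # Rotate by yaw about the vertical axis (positive yaw turns to the right).
    t = math.radians(yaw_deg)
    dx, dz = (dx * math.cos(t) + dz * math.sin(t),
              -dx * math.sin(t) + dz * math.cos(t))

    # Convert the ray direction to spherical angles.
    lon = math.atan2(dx, dz)                    # range [-pi, pi]
    lat = math.atan2(-dy, math.hypot(dx, dz))   # range [-pi/2, pi/2]

    # Spherical angles to equirectangular pixel coordinates.
    x = (lon / (2.0 * math.pi) + 0.5) * pano_w
    y = (0.5 - lat / math.pi) * pano_h
    return x, y


if __name__ == "__main__":
    # Centre pixel of a forward-facing 90-degree view lands at the panorama centre.
    print(perspective_pixel_to_equirect(256, 256, 512, 512, 90, 0, 0, 2048, 1024))
```

Under these assumptions, equal steps in perspective pixels do not correspond to equal steps in panorama pixels, and the mismatch grows toward the poles. This is the kind of distortion a projection-aware mechanism has to account for when the two branches exchange information.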
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques
Methods: Diffusion
