Autoregressive Omni-Aware Outpainting for Open-Vocabulary 360-Degree Image Generation
Zhuqiang Lu, Kun Hu, Chaoyue Wang, Lei Bai, Zhiyong Wang

TL;DR
This paper introduces AOG-Net, an autoregressive omni-aware model for 360-degree image outpainting guided by NFoV images and text, enabling detailed, flexible, and open-vocabulary 360-degree scene synthesis.
Contribution
The paper proposes a novel autoregressive omni-aware generative network that combines multi-modal guidance and global-local conditioning for high-quality 360-degree image outpainting.
Findings
Achieves state-of-the-art results on indoor and outdoor datasets.
Supports flexible editing with text and visual guidance during generation.
Demonstrates effective open-vocabulary scene synthesis.
Abstract
A 360-degree (omni-directional) image provides an all-encompassing spherical view of a scene. Recently, there has been increasing interest in synthesising 360-degree images from conventional narrow field of view (NFoV) images captured by digital cameras and smartphones, to provide immersive experiences in scenarios such as virtual reality. Yet existing methods typically fall short in synthesising intricate visual details or in ensuring that the generated images align consistently with user-provided prompts. In this study, an autoregressive omni-aware generative network (AOG-Net) is proposed for 360-degree image generation by out-painting an incomplete 360-degree image progressively with NFoV and text guidance, jointly or individually. This autoregressive scheme not only allows for deriving finer-grained and text-consistent patterns by dynamically generating and adjusting the process…
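The progressive out-painting idea in the abstract can be illustrated with a toy loop. This is only a minimal sketch under strong assumptions, not AOG-Net: the learned, text-conditioned generator is replaced by a hypothetical stand-in (`fill_strip`) that fills unknown pixels with the mean colour of the known context, the NFoV image is pasted directly into the canvas centre instead of being projected onto the sphere, and the names `outpaint_autoregressively`, `height`, `width`, and `step` are invented for illustration. What it does show is the autoregressive structure: each step is conditioned on everything generated so far, and the context window wraps around horizontally because the left and right edges of a 360-degree image are adjacent.

```python
import numpy as np


def fill_strip(context, context_known, strip, strip_known):
    """Hypothetical stand-in for a learned generator.

    Fills the unknown pixels of `strip` with the mean colour of the known
    pixels in `context`. A real model would synthesise new content here.
    """
    if not context_known.any():
        raise ValueError("no known pixels in the context window")
    value = context[context_known].mean(axis=0)  # mean colour, shape (3,)
    out = strip.copy()
    out[~strip_known] = value
    return out


def outpaint_autoregressively(nfov, height=64, width=128, step=16):
    """Fill a canvas strip by strip, starting next to the NFoV input."""
    canvas = np.zeros((height, width, 3), dtype=np.float32)
    known = np.zeros((height, width), dtype=bool)

    # Simplification: paste the NFoV image at the canvas centre.
    h, w = nfov.shape[:2]
    top, left = (height - h) // 2, (width - w) // 2
    canvas[top:top + h, left:left + w] = nfov
    known[top:top + h, left:left + w] = True

    # Visit vertical strips from the centre outward, using circular distance
    # so that the order respects the horizontal wrap-around.
    n_strips = width // step
    centre = n_strips // 2
    order = sorted(
        range(n_strips),
        key=lambda i: min(abs(i - centre), n_strips - abs(i - centre)),
    )

    for i in order:
        # Context: the target strip plus one neighbour on each side (wrapped).
        ctx = np.arange((i - 1) * step, (i + 2) * step) % width
        tgt = np.arange(i * step, (i + 1) * step)
        canvas[:, tgt] = fill_strip(
            canvas[:, ctx], known[:, ctx], canvas[:, tgt], known[:, tgt]
        )
        known[:, tgt] = True  # later steps are conditioned on this strip

    return canvas, known
```

Because every strip is marked as known once it has been filled, later steps see earlier outputs as context, which is the property that lets an autoregressive scheme adjust generation as it proceeds.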
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques
Methods: ALIGN
