ImageBART: Bidirectional Context with Multinomial Diffusion for Autoregressive Image Synthesis
Patrick Esser, Robin Rombach, Andreas Blattmann, Björn Ommer

TL;DR
ImageBART is a hierarchical autoregressive model that combines multinomial diffusion with coarse-to-fine synthesis, giving each stage access to global image context and improving image editing capabilities.
Contribution
It presents a novel combination of autoregressive modeling with multinomial diffusion to incorporate global context in image synthesis and editing.
Findings
Enhanced image modification capabilities over traditional autoregressive models
High-fidelity image generation in a compressed latent space
Effective local image editing and inpainting without mask-specific training
Abstract
Autoregressive models and their sequential factorization of the data likelihood have recently demonstrated great potential for image representation and synthesis. Nevertheless, they incorporate image context in a linear 1D order by attending only to previously synthesized image patches above or to the left. Not only is this unidirectional, sequential bias of attention unnatural for images as it disregards large parts of a scene until synthesis is almost complete. It also processes the entire image on a single scale, thus ignoring more global contextual information up to the gist of the entire scene. As a remedy we incorporate a coarse-to-fine hierarchy of context by combining the autoregressive formulation with a multinomial diffusion process: Whereas a multistage diffusion process successively removes information to coarsen an image, we train a (short) Markov chain to invert this…
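The forward (information-removing) direction described in the abstract can be illustrated with a minimal sketch. It assumes the commonly used multinomial diffusion kernel, in which each discrete token is kept with probability 1 - beta and otherwise resampled uniformly from the codebook. The function names, the noise schedule `betas`, and the codebook size `num_classes` are illustrative assumptions, not values or code from the paper, and the learned reverse chain (the autoregressive part) is not shown.

```python
import random


def diffusion_step(tokens, beta, num_classes, rng):
    """One forward step on a sequence of discrete tokens.

    Each token is resampled uniformly from {0, ..., num_classes - 1} with
    probability beta and kept otherwise (assumed kernel, see lead-in).
    """
    return [
        rng.randrange(num_classes) if rng.random() < beta else tok
        for tok in tokens
    ]


def forward_chain(tokens, betas, num_classes, seed=0):
    """Apply one step per entry of `betas`; return every intermediate state.

    states[0] is the clean token sequence; later entries are progressively
    noisier, i.e. they carry less information about the original image.
    """
    rng = random.Random(seed)
    states = [list(tokens)]
    for beta in betas:
        states.append(diffusion_step(states[-1], beta, num_classes, rng))
    return states


if __name__ == "__main__":
    # Hypothetical 8-entry codebook and a short 3-step schedule.
    for t, state in enumerate(forward_chain([3, 1, 4, 1, 5, 2, 6, 5], [0.1, 0.3, 0.6], 8)):
        print(t, state)
```

A short chain such as the three steps above mirrors the "(short) Markov chain" mentioned in the abstract: a model trained to invert it would predict `states[t - 1]` from `states[t]`, so at every stage it sees a coarsened version of the whole image rather than only the tokens above and to the left.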
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Image Processing Techniques · Computer Graphics and Visualization Techniques
Methods: Diffusion · Inpainting
