Re-Imagen: Retrieval-Augmented Text-to-Image Generator
Wenhu Chen, Hexiang Hu, Chitwan Saharia, William W. Cohen

TL;DR
Re-Imagen is a novel retrieval-augmented text-to-image generator that leverages external knowledge bases to produce more accurate and high-fidelity images of rare or unseen entities, outperforming existing models.
Contribution
The paper introduces Re-Imagen, a new model that integrates retrieval mechanisms with generative models to improve image generation for uncommon entities, along with a new benchmark for evaluation.
Findings
Re-Imagen significantly improves FID scores on the COCO and WikiImage datasets.
The model enhances image fidelity for rare entities compared to prior methods.
Human evaluations confirm better accuracy and realism in generated images.
Abstract
Research on text-to-image generation has witnessed significant progress in generating diverse and photo-realistic images, driven by diffusion and auto-regressive models trained on large-scale image-text data. Though state-of-the-art models can generate high-quality images of common entities, they often have difficulty generating images of uncommon entities, such as 'Chortai (dog)' or 'Picarones (food)'. To tackle this issue, we present the Retrieval-Augmented Text-to-Image Generator (Re-Imagen), a generative model that uses retrieved information to produce high-fidelity and faithful images, even for rare or unseen entities. Given a text prompt, Re-Imagen accesses an external multi-modal knowledge base to retrieve relevant (image, text) pairs and uses them as references to generate the image. With this retrieval step, Re-Imagen is augmented with the knowledge of high-level semantics and…
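The retrieval step described in the abstract can be illustrated with a toy sketch. This is not the paper's retriever: the bag-of-words cosine similarity, the function names (tokenize, cosine, retrieve), the image identifiers, and the captions are all hypothetical stand-ins chosen for illustration. The sketch only shows the general shape of the idea, which is to score the (image, text) pairs in a knowledge base against the prompt and keep the top k as references for the generator.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer; a stand-in for a real text encoder.
    return text.lower().split()


def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(prompt, knowledge_base, k=2):
    # Score every (image_id, caption) pair against the prompt and
    # return the k best matches, dropping pairs with zero overlap.
    query = Counter(tokenize(prompt))
    scored = [
        (cosine(query, Counter(tokenize(caption))), image_id, caption)
        for image_id, caption in knowledge_base
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(image_id, caption) for score, image_id, caption in scored[:k] if score > 0]


# Hypothetical knowledge base: image identifiers paired with captions.
KB = [
    ("img_001", "a chortai dog standing in a field"),
    ("img_002", "picarones served with syrup"),
    ("img_003", "a chortai dog running"),
    ("img_004", "a city skyline at night"),
]

neighbors = retrieve("a chortai dog on the beach", KB, k=2)
for image_id, caption in neighbors:
    print(image_id, caption)
```

In a full system, the retrieved pairs would be passed to the generative model as additional conditioning alongside the prompt; that part is outside the scope of this sketch.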
Taxonomy
Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Image Retrieval and Classification Techniques
Methods: Diffusion · Balanced Selection
