ImageRAGTurbo: Towards One-step Text-to-Image Generation with Retrieval-Augmented Diffusion Models
Peijie Qiu, Hariharan Ramshankar, Arnau Ramisa, René Vidal, Amit Kumar K C, Vamsi Salaka, Rahul Bhagat

TL;DR
ImageRAGTurbo introduces a retrieval-augmented diffusion approach that enables one-step text-to-image generation with high fidelity, reducing latency without sacrificing quality by leveraging relevant retrieved examples.
Contribution
The paper proposes a retrieval-augmented finetuning method for diffusion models, enhancing one-step generation quality without extensive retraining.
Findings
High-fidelity images generated in one step
Retrieval augmentation improves prompt alignment
Efficient blending of retrieved content enhances quality
Abstract
Diffusion models have emerged as the leading approach for text-to-image generation. However, their iterative sampling process, which gradually morphs random noise into coherent images, introduces significant latency that limits their applicability. While recent few-step diffusion models reduce sampling to as few as one to four steps, they often compromise image quality and prompt alignment, especially in one-step generation. Additionally, these models require computationally expensive training procedures. To address these limitations, we propose ImageRAGTurbo, a novel approach to efficiently finetune few-step diffusion models via retrieval augmentation. Given a text prompt, we retrieve relevant text-image pairs from a database and use them to condition the generation process. We argue that such retrieved examples provide rich contextual information to the UNet…
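The retrieval step described in the abstract (given a prompt, fetch relevant text-image pairs from a database) can be illustrated with a minimal nearest-neighbor sketch. This is not the authors' implementation: the abstract does not state which encoder, index, or similarity measure is used, so the cosine-similarity ranking, the function names `cosine_similarity` and `retrieve_top_k`, the toy 3-dimensional embeddings, and the image identifiers below are all assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical record layout: (caption, image_id, embedding).
Record = Tuple[str, str, Sequence[float]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve_top_k(
    query_emb: Sequence[float], database: List[Record], k: int = 2
) -> List[Tuple[float, str, str]]:
    """Return the k (score, caption, image_id) entries most similar to the query."""
    scored = [
        (cosine_similarity(query_emb, emb), caption, image_id)
        for caption, image_id, emb in database
    ]
    # Highest similarity first.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


# Toy database with made-up embeddings; a real system would obtain these
# from a text or image encoder (assumed here, not specified in the abstract).
database: List[Record] = [
    ("a red apple on a table", "img_001", [1.0, 0.0, 0.0]),
    ("a green pear on a plate", "img_002", [0.8, 0.6, 0.0]),
    ("a city skyline at night", "img_003", [0.0, 0.0, 1.0]),
]

query_embedding = [0.9, 0.1, 0.0]  # stand-in for an encoded text prompt
top_matches = retrieve_top_k(query_embedding, database, k=2)
for score, caption, image_id in top_matches:
    print(f"{image_id}: {caption} (similarity {score:.3f})")
```

In a pipeline like the one the abstract outlines, the retrieved pairs would then be passed to the generator as additional conditioning; how that conditioning is injected is not recoverable from the truncated abstract, so it is left out of this sketch.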
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Digital Humanities and Scholarship · Computer Graphics and Visualization Techniques
