IA-T2I: Internet-Augmented Text-to-Image Generation
Chuanhao Li, Jianwen Sun, Yukang Feng, Mingliang Zhai, Yifan Chang, Kaipeng Zhang

TL;DR
This paper introduces IA-T2I, a framework that enhances text-to-image generation by incorporating internet-sourced reference images and self-reflection to better handle uncertain knowledge in prompts.
Contribution
The paper proposes a novel internet-augmented T2I framework that combines active retrieval, hierarchical image selection, and a self-reflection mechanism to better handle uncertain knowledge in prompts.
Findings
Outperforms GPT-4o by about 30% in human evaluation.
Effectively handles uncertain, rare, unknown, and ambiguous knowledge in prompts.
Introduces the Img-Ref-T2I dataset for evaluation.
Abstract
Current text-to-image (T2I) generation models achieve promising results, but they fail in scenarios where the knowledge implied in the text prompt is uncertain. For example, a T2I model released in February would struggle to generate a suitable poster for a movie premiering in April, because the character designs and styles are unknown to the model. To solve this problem, we propose an Internet-Augmented text-to-image generation (IA-T2I) framework that resolves such uncertain knowledge for T2I models by providing them with reference images. Specifically, an active retrieval module is designed to determine whether a reference image is needed based on the given text prompt; a hierarchical image selection module is introduced to find the most suitable image returned by an image search engine to enhance the T2I model; a self-reflection mechanism is presented to continuously…
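The control flow of the three modules named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: every function and parameter name (`select_reference`, `generate_with_reference`, `coarse_score`, `fine_score`, `accept`, `keep`, `max_rounds`) is hypothetical, "hierarchical" is read here as a coarse-then-fine two-stage ranking, and because the abstract is truncated, the retry-until-accepted loop used for self-reflection is an assumption. The decision model, search engine, scorers, and T2I model are passed in as plain callables.

```python
from typing import Callable, List, Optional, TypeVar

Image = TypeVar("Image")  # whatever type the search engine / T2I model returns


def select_reference(
    candidates: List[Image],
    coarse_score: Callable[[Image], float],
    fine_score: Callable[[Image], float],
    keep: int = 3,
) -> Optional[Image]:
    """Assumed two-stage selection: shortlist with a cheap score, then pick
    the best shortlisted candidate with a more careful score."""
    if not candidates:
        return None
    shortlist = sorted(candidates, key=coarse_score, reverse=True)[:keep]
    return max(shortlist, key=fine_score)


def generate_with_reference(
    prompt: str,
    needs_reference: Callable[[str], bool],        # active retrieval decision
    search: Callable[[str], List[Image]],          # image search engine wrapper
    select: Callable[[List[Image]], Optional[Image]],
    generate: Callable[[str, Optional[Image]], Image],  # T2I model wrapper
    accept: Callable[[str, Image], bool],          # self-reflection check
    max_rounds: int = 3,
) -> Image:
    """Retrieve a reference only when needed, generate, then regenerate
    until the reflection check passes or the round budget is used up."""
    reference = select(search(prompt)) if needs_reference(prompt) else None
    image = generate(prompt, reference)
    for _ in range(max_rounds - 1):
        if accept(prompt, image):
            break
        # Assumption: a rejected image simply triggers another attempt.
        image = generate(prompt, reference)
    return image
```

Passing the components as callables keeps the sketch independent of any particular search API or T2I model; in practice each callable would wrap a real model or service.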
Taxonomy
Topics: Video Analysis and Summarization · Mathematics, Computing, and Information Processing · Computational Physics and Python Applications
