ITI-GEN: Inclusive Text-to-Image Generation
Cheng Zhang, Xuanbai Chen, Siqi Chai, Chen Henry Wu, Dmitry Lagun, Thabo Beeler, Fernando De la Torre

TL;DR
ITI-GEN introduces a novel method for inclusive text-to-image generation that uses reference images and learned prompt embeddings to better represent diverse attributes without fine-tuning, improving inclusivity in generated images.
Contribution
The paper proposes ITI-GEN, a reference-image-based approach that enhances inclusivity in text-to-image models without requiring model fine-tuning.
Findings
Significantly improves diversity of generated images across attributes.
Effectively uses reference images to represent hard-to-describe categories.
Achieves state-of-the-art performance in inclusive image generation.
Abstract
Text-to-image generative models often reflect the biases of the training data, leading to unequal representations of underrepresented groups. This study investigates inclusive text-to-image generative models that generate images based on human-written prompts and ensure the resulting images are uniformly distributed across attributes of interest. Unfortunately, directly expressing the desired attributes in the prompt often leads to sub-optimal results due to linguistic ambiguity or model misrepresentation. Hence, this paper proposes a drastically different approach that adheres to the maxim that "a picture is worth a thousand words". We show that, for some attributes, images can represent concepts more expressively than text. For instance, categories of skin tones are typically hard to specify by text but can be easily represented by example images. Building upon these insights, we…
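The abstract's goal that generated images be "uniformly distributed across attributes of interest" can be made concrete with a small check. The sketch below is an illustration only, not the paper's evaluation code: it assumes each generated image has already been assigned an attribute label by some classifier, and it measures the KL divergence between the empirical label distribution and the uniform distribution (0 means perfectly balanced). The function name `kl_from_uniform` and the example labels are hypothetical.

```python
import math
from collections import Counter


def kl_from_uniform(labels, categories):
    """Return KL(p || u), where p is the empirical distribution of `labels`
    over `categories` and u is the uniform distribution over `categories`.

    Assumption: labels come from an external attribute classifier; this
    function only measures how balanced they are.
    """
    if not labels:
        raise ValueError("labels must be non-empty")
    unknown = set(labels) - set(categories)
    if unknown:
        raise ValueError(f"labels outside categories: {sorted(unknown)}")

    counts = Counter(labels)
    n = len(labels)
    k = len(categories)
    kl = 0.0
    for c in categories:
        p = counts.get(c, 0) / n
        if p > 0:  # terms with p == 0 contribute nothing to the sum
            kl += p * math.log(p * k)  # p * log(p / (1/k))
    return kl


if __name__ == "__main__":
    # Hypothetical labels for eight generated images and two categories.
    balanced = ["a", "b"] * 4
    skewed = ["a"] * 7 + ["b"]
    print(kl_from_uniform(balanced, ["a", "b"]))
    print(kl_from_uniform(skewed, ["a", "b"]))
```

A lower value indicates a more even spread over the attribute categories; a set of images that all fall in one of two categories gives log 2, the maximum for two categories.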
Taxonomy
Topics: Multimodal Machine Learning Applications · Computational and Text Analysis Methods · Generative Adversarial Networks and Image Synthesis
