TL;DR
Varif.ai is a system for user-driven diversity in image generation: it iteratively generates images, verifies how well user-specified attributes are covered, and varies those attributes, with the aim of supporting creativity and fair representation.
Contribution
It introduces a new framework combining text-to-image and large language models for controlled, iterative diversity in image generation, addressing user-specific diversity goals.
Findings
Varif.ai improves diversity control over baseline methods.
A pilot validation showed that Varif.ai made achieving diverse image sets easier.
Controlled evaluation shows higher effectiveness in diverse scenarios.
Abstract
Diversity in image generation is essential to ensure fair representations and support creativity in ideation. Hence, many text-to-image models have implemented diversification mechanisms. Yet, after a few iterations of generation, a lack of diversity becomes apparent, because each user has their own diversity goals (e.g., different colors, brands of cars), and there are diverse attributes to be specified. To support user-driven diversity control, we propose Varif.ai that employs text-to-image and Large Language Models to iteratively i) (re)generate a set of images, ii) verify if user-specified attributes have sufficient coverage, and iii) vary existing or new attributes. Through an elicitation study, we uncovered user needs for diversity in image generation. A pilot validation showed that Varif.ai made achieving diverse image sets easier. In a controlled evaluation with 20…
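The generate, verify, vary loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: `generate`, `label`, `verify`, `vary`, and `diversify` are hypothetical names, the text-to-image model and the LLM-based verifier are replaced by simple string-based stand-ins, and the coverage target (an even split across attribute values) is an assumption made for illustration.

```python
from collections import Counter
from itertools import cycle, islice


def generate(prompts):
    """Stand-in for a text-to-image model (assumption): one record per prompt."""
    return [{"prompt": p} for p in prompts]


def label(image, values):
    """Stand-in for an LLM-based verifier (assumption): find the value in the prompt."""
    return next((v for v in values if v in image["prompt"]), None)


def verify(images, values):
    """Count how many images cover each user-specified attribute value."""
    counts = Counter(label(img, values) for img in images)
    return {v: counts.get(v, 0) for v in values}


def vary(base_prompt, values, n_images):
    """Rewrite the prompts so attribute values are assigned round-robin."""
    return [f"{v} {base_prompt}" for v in islice(cycle(values), n_images)]


def diversify(base_prompt, values, n_images, max_rounds=3):
    """Iterate generate -> verify -> vary until every value is sufficiently covered."""
    target = max(1, n_images // len(values))  # assumed target: an even split
    prompts = [base_prompt] * n_images
    for _ in range(max_rounds):
        images = generate(prompts)                      # i) (re)generate
        coverage = verify(images, values)               # ii) verify coverage
        if min(coverage.values()) >= target:
            break
        prompts = vary(base_prompt, values, n_images)   # iii) vary attributes
    return images, coverage


if __name__ == "__main__":
    images, coverage = diversify("car", ["red", "blue", "green"], n_images=6)
    print(coverage)
```

In this toy setup the first round produces no coverage because the plain prompt names no color, so the loop varies the prompts and regenerates. In a real system the verification step would inspect the generated images rather than the prompt text.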
