Interactive Visual Assessment for Text-to-Image Generation Models
Xiaoyue Mi, Fan Tang, Juan Cao, Qiang Sheng, Ziyao Huang, Peng Li, Yang Liu, Tong-Yee Lee

TL;DR
DyEval is an interactive, LLM-powered framework that enhances the evaluation of text-to-image models by enabling dynamic, collaborative, and interpretable assessment, uncovering complex failure patterns more effectively.
Contribution
The paper introduces DyEval, a novel interactive assessment framework that adaptively probes models and provides interpretability, surpassing traditional static evaluation methods.
Findings
Identifies up to 2.56 times more failures than conventional methods.
Uncovers complex failure patterns like pronoun and cultural context issues.
Demonstrates effectiveness through qualitative and quantitative experiments.
Abstract
Visual generation models have achieved remarkable progress in computer graphics applications but still face significant challenges in real-world deployment. Current assessment approaches for visual generation tasks typically follow an isolated three-phase framework: test input collection, model output generation, and user assessment. These approaches suffer from fixed coverage, evolving difficulty, and data leakage risks, limiting their effectiveness in comprehensively evaluating increasingly complex generation models. To address these limitations, we propose DyEval, an LLM-powered dynamic interactive visual assessment framework that facilitates collaborative evaluation between humans and generative models for text-to-image systems. DyEval features an intuitive visual interface that enables users to interactively explore and analyze model behaviors, while adaptively generating…
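The contrast the abstract draws between a fixed three-phase pipeline and adaptive probing can be illustrated with a minimal sketch. This is not DyEval's implementation: every name below (`static_eval`, `adaptive_eval`, `toy_generate`, `toy_judge`, `toy_expand`) is hypothetical, and the toy "model" is a string function that drops the word "it", standing in for a text-to-image system that mishandles pronouns.

```python
from typing import Callable, List

# Hypothetical stand-ins; a real setup would call a text-to-image model,
# a human or automated judge, and an LLM that proposes follow-up prompts.
Generate = Callable[[str], str]
Judge = Callable[[str, str], bool]    # True when the output matches the prompt
Expand = Callable[[str], List[str]]   # proposes new prompts from a failed one


def static_eval(prompts: List[str], generate: Generate, judge: Judge) -> List[str]:
    """Three isolated phases: fixed inputs, generation, then assessment."""
    outputs = [(p, generate(p)) for p in prompts]            # output generation
    return [p for p, out in outputs if not judge(p, out)]    # assessment


def adaptive_eval(seeds: List[str], generate: Generate, judge: Judge,
                  expand: Expand, rounds: int) -> List[str]:
    """Feed each round's failures back in as the source of the next prompts."""
    frontier, seen, failures = list(seeds), set(seeds), []
    for _ in range(rounds):
        failed = static_eval(frontier, generate, judge)
        failures.extend(failed)
        frontier = []
        for prompt in failed:
            for candidate in expand(prompt):
                if candidate not in seen:   # avoid re-testing the same prompt
                    seen.add(candidate)
                    frontier.append(candidate)
        if not frontier:
            break
    return failures


# Toy components (assumptions for illustration only).
def toy_generate(prompt: str) -> str:
    return " ".join(w for w in prompt.split() if w != "it")  # "ignores" pronouns

def toy_judge(prompt: str, output: str) -> bool:
    return output == prompt

def toy_expand(prompt: str) -> List[str]:
    return [prompt + " while it spins", prompt + " at night"]


seeds = ["a cat", "a dog and it barks"]
static_failures = static_eval(seeds, toy_generate, toy_judge)
adaptive_failures = adaptive_eval(seeds, toy_generate, toy_judge, toy_expand, rounds=2)
```

In this toy run the fixed prompt set surfaces one failure, while two adaptive rounds surface three, because the follow-up prompts are derived from the prompt that already failed. The numbers are properties of the toy only and say nothing about the paper's reported results.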
Taxonomy
Topics: Data Visualization and Analytics · Augmented Reality Applications · Multimedia Communication and Technology
