UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation

Yibin Wang; Zhimin Li; Yuhang Zang; Jiazi Bu; Yujie Zhou; Yi Xin; Junjun He; Chunyu Wang; Qinglin Lu; Cheng Jin; Jiaqi Wang

arXiv:2510.18701·cs.CV·February 25, 2026

Open Access · 2 Models · 2 Datasets

TL;DR

UniGenBench++ is a comprehensive, multilingual benchmark with diverse prompts and fine-grained evaluation criteria for assessing the semantic accuracy of text-to-image models.

Contribution

It introduces a hierarchical, multilingual benchmark with detailed evaluation dimensions and a robust assessment pipeline for T2I models.

Findings

01. Revealed strengths and weaknesses of various T2I models
02. Provided a scalable, fine-grained evaluation framework
03. Enhanced assessment reliability with multilingual prompts

Abstract

Recent progress in text-to-image (T2I) generation underscores the importance of reliable benchmarks for evaluating how accurately generated images reflect the semantics of their textual prompts. However, (1) existing benchmarks lack diversity in prompt scenarios and lack multilingual support, both of which are essential for real-world applicability; (2) they offer only coarse evaluations across primary dimensions, cover a narrow range of sub-dimensions, and fall short in fine-grained sub-dimension assessment. To address these limitations, we introduce UniGenBench++, a unified semantic assessment benchmark for T2I generation. Specifically, it comprises 600 prompts organized hierarchically to ensure both coverage and efficiency: (1) it spans diverse real-world scenarios, i.e., 5 main prompt themes and 20 subthemes; (2) it comprehensively probes T2I models' semantic consistency over 10 primary and 27…
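The hierarchical organization described in the abstract (themes containing subthemes, primary dimensions containing sub-dimensions) can be pictured with a small data structure. The following is a minimal sketch, not the authors' data format or evaluation pipeline: the `PromptRecord` class, its field names, the example labels, and the assumption that each probed testpoint receives a binary pass/fail judgment are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptRecord:
    """One benchmark prompt with hypothetical hierarchy labels."""
    prompt_id: str
    theme: str         # a main prompt theme
    subtheme: str      # a subtheme nested under the theme
    testpoints: tuple  # (primary_dimension, sub_dimension) pairs the prompt probes


def aggregate(records, judgments):
    """Average binary judgments per primary dimension and per sub-dimension.

    `judgments` maps (prompt_id, primary, sub) -> bool (assumed pass/fail).
    Testpoints without a judgment are skipped.
    """
    primary_hits = defaultdict(list)
    sub_hits = defaultdict(list)
    for rec in records:
        for primary, sub in rec.testpoints:
            key = (rec.prompt_id, primary, sub)
            if key not in judgments:
                continue
            passed = judgments[key]
            primary_hits[primary].append(passed)
            sub_hits[(primary, sub)].append(passed)

    def mean(values):
        return sum(values) / len(values)

    primary_scores = {k: mean(v) for k, v in primary_hits.items()}
    sub_scores = {k: mean(v) for k, v in sub_hits.items()}
    return primary_scores, sub_scores


# Illustrative data only; the labels below are made up, not the benchmark's taxonomy.
records = [
    PromptRecord("p1", "Theme A", "Subtheme A1",
                 (("attribute", "color"), ("attribute", "shape"))),
    PromptRecord("p2", "Theme B", "Subtheme B1",
                 (("attribute", "color"), ("relationship", "spatial"))),
]
judgments = {
    ("p1", "attribute", "color"): True,
    ("p1", "attribute", "shape"): False,
    ("p2", "attribute", "color"): True,
    ("p2", "relationship", "spatial"): False,
}
primary_scores, sub_scores = aggregate(records, judgments)
```

Keeping the sub-dimension key as a `(primary, sub)` pair lets coarse and fine-grained scores come from the same pass over the judgments, which is one simple way to report both levels of a hierarchy.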

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Digital Humanities and Scholarship