WorldGenBench: A World-Knowledge-Integrated Benchmark for Reasoning-Driven Text-to-Image Generation
Daoan Zhang, Che Jiang, Ruoshi Xu, Biaoxiang Chen, Zijian Jin, Yutian Lu, Jianguo Zhang, Liang Yong, Jiebo Luo, Shengda Luo

TL;DR
WorldGenBench is a new benchmark for evaluating text-to-image models' ability to incorporate world knowledge and reasoning, revealing current strengths and gaps in state-of-the-art systems.
Contribution
Introduces WorldGenBench, a comprehensive benchmark with a Knowledge Checklist Score to assess reasoning and knowledge grounding in T2I models.
Findings
Diffusion models outperform other open-source methods.
Proprietary models like GPT-4o show stronger reasoning capabilities.
Current models still lack deep understanding and inference abilities.
Abstract
Recent advances in text-to-image (T2I) generation have achieved impressive results, yet existing models still struggle with prompts that require rich world knowledge and implicit reasoning, both of which are critical for producing semantically accurate, coherent, and contextually appropriate images in real-world scenarios. To address this gap, we introduce WorldGenBench, a benchmark designed to systematically evaluate T2I models' world knowledge grounding and implicit inferential capabilities, covering both the humanities and nature domains. We propose the Knowledge Checklist Score, a structured metric that measures how well generated images satisfy key semantic expectations. Experiments across 21 state-of-the-art models reveal that while diffusion models lead among open-source methods, proprietary auto-regressive models like GPT-4o exhibit significantly stronger…
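The abstract describes the Knowledge Checklist Score only as a measure of how well an image satisfies key semantic expectations. As a rough illustration, the sketch below assumes the score is the fraction of checklist items judged satisfied for one generated image. That formula, the `ChecklistItem` type, the `knowledge_checklist_score` function, and the example checklist are all hypothetical and not taken from the paper, which may define, weight, or aggregate the score differently.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ChecklistItem:
    """One semantic expectation for a prompt (hypothetical structure)."""
    description: str
    satisfied: bool  # verdict from a human or model judge, supplied externally


def knowledge_checklist_score(items: Sequence[ChecklistItem]) -> float:
    """Return the fraction of checklist items marked satisfied.

    Assumption: unweighted ratio of satisfied items to total items.
    """
    if not items:
        raise ValueError("checklist must contain at least one item")
    return sum(item.satisfied for item in items) / len(items)


# Made-up checklist for a single generated image.
example = [
    ChecklistItem("period-appropriate clothing is shown", True),
    ChecklistItem("architecture matches the stated region", True),
    ChecklistItem("no anachronistic objects appear", False),
    ChecklistItem("season matches the described event", True),
]
print(knowledge_checklist_score(example))  # 0.75
```

Under this assumption, a benchmark-level number would come from averaging the per-image scores across all prompts.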
Taxonomy
Topics: Multimodal Machine Learning Applications · Natural Language Processing Techniques · Topic Modeling
Methods: Diffusion
