R2I-Bench: Benchmarking Reasoning-Driven Text-to-Image Generation
Kaijie Chen, Zihao Lin, Zhiyang Xu, Ying Shen, Yuguang Yao, Joy Rimchala, Jiaxin Zhang, Lifu Huang

TL;DR
R2I-Bench is a new benchmark designed to evaluate reasoning capabilities in text-to-image generation models, revealing current models' limitations and guiding future improvements in reasoning-aware architectures.
Contribution
The paper introduces R2I-Bench, a comprehensive dataset and evaluation metric specifically targeting reasoning in text-to-image generation, filling a critical gap in current assessment methods.
Findings
Current models show limited reasoning performance.
Decoupled reasoning and generation frameworks still struggle with reasoning tasks.
The benchmark highlights the need for more reasoning-aware T2I architectures.
Abstract
Reasoning is a fundamental capability often required in real-world text-to-image (T2I) generation, e.g., generating "a bitten apple that has been left in the air for more than a week" necessitates understanding temporal decay and commonsense concepts. While recent T2I models have made impressive progress in producing photorealistic images, their reasoning capability remains underdeveloped and insufficiently evaluated. To bridge this gap, we introduce R2I-Bench, a comprehensive benchmark specifically designed to rigorously assess reasoning-driven T2I generation. R2I-Bench comprises meticulously curated data instances, spanning core reasoning categories, including commonsense, mathematical, logical, compositional, numerical, causal, and concept mixing. To facilitate fine-grained evaluation, we design R2IScore, a QA-style metric based on instance-specific, reasoning-oriented evaluation…
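The abstract is cut off before it explains how R2IScore is computed, so the following is only a minimal sketch of what a QA-style metric over instance-specific questions might look like, not the paper's formula. Everything in it is an assumption made for illustration: the `EvalQuestion` schema, the `qa_style_score` function, the unweighted accuracy aggregation, and the `stub_answerer` that stands in for whatever visual question-answering model an evaluator would actually use.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence


@dataclass(frozen=True)
class EvalQuestion:
    """One instance-specific question with its expected answer (hypothetical schema)."""
    text: str
    expected: str


def qa_style_score(
    image: Any,
    questions: Sequence[EvalQuestion],
    answer_fn: Callable[[Any, str], str],
) -> float:
    """Return the fraction of questions whose answer matches the expected one.

    Assumption: plain unweighted accuracy. The real R2IScore may weight or
    aggregate questions differently.
    """
    if not questions:
        raise ValueError("at least one evaluation question is required")
    correct = 0
    for q in questions:
        predicted = answer_fn(image, q.text)
        # Compare case-insensitively and ignore surrounding whitespace.
        if predicted.strip().lower() == q.expected.strip().lower():
            correct += 1
    return correct / len(questions)


def stub_answerer(image: dict, question: str) -> str:
    """Placeholder for a VQA model: the 'image' is a dict of known facts."""
    return "yes" if image.get(question, False) else "no"
```

With the bitten-apple prompt, for example, an evaluator could pair the generated image with questions such as "Is the apple bitten?" and "Is the exposed flesh browned?", each expecting "yes". An image that satisfies only the first would score 0.5 under this sketch.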
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Data Visualization and Analytics
