MMGenBench: Fully Automatically Evaluating LMMs from the Text-to-Image Generation Perspective
Hailang Huang, Yong Wang, Zixuan Huang, Huaqiu Li, Tongwen Huang, Xiangxiang Chu, Richong Zhang

TL;DR
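This paper introduces MMGenBench, a fully automated evaluation pipeline for large multimodal models (LMMs). An LMM describes an input image, a text-to-image model generates an auxiliary image from that description, and the original and generated images are compared to judge how well the LMM understood and described the image.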
Contribution
The authors develop MMGenBench-Pipeline, a fully automated evaluation pipeline, together with two benchmarks: MMGenBench-Test, which evaluates LMMs across 13 distinct image patterns, and MMGenBench-Domain, which focuses on generative image performance.
Findings
Many LMMs that perform well on existing benchmarks underperform on image understanding tasks.
The pipeline effectively evaluates LMMs across 13 image patterns and multiple domains.
Results reveal significant room for improvement in current LMMs' image description abilities.
Abstract
Large Multimodal Models (LMMs) demonstrate impressive capabilities. However, current benchmarks predominantly focus on image comprehension in specific domains, and these benchmarks are labor-intensive to construct. Moreover, their answers tend to be brief, making it difficult to assess the ability of LMMs to generate detailed descriptions of images. To address these limitations, we propose the MMGenBench-Pipeline, a straightforward and fully automated evaluation pipeline. This involves generating textual descriptions from input images, using these descriptions to create auxiliary images via text-to-image generative models, and then comparing the original and generated images. Furthermore, to ensure the effectiveness of MMGenBench-Pipeline, we design MMGenBench-Test, evaluating LMMs across 13 distinct image patterns, and MMGenBench-Domain, focusing on generative image performance. A…
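The three steps described in the abstract (describe, regenerate, compare) can be sketched as follows. This is only an illustrative outline, not the authors' implementation: the function names `evaluate_lmm`, `describe`, `generate`, and `embed` are hypothetical, the models are passed in as plain callables, and cosine similarity over image embeddings is an assumed comparison metric that the text above does not specify. The toy stand-ins at the bottom treat an "image" as a short vector of numbers so the sketch runs without any model.

```python
import math
from typing import Callable, Iterable, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def evaluate_lmm(
    images: Iterable[object],
    describe: Callable[[object], str],   # hypothetical: the LMM under test
    generate: Callable[[str], object],   # hypothetical: a text-to-image model
    embed: Callable[[object], Vector],   # hypothetical: an image encoder
) -> float:
    """Mean similarity between each original image and its regenerated counterpart."""
    scores = []
    for image in images:
        description = describe(image)        # step 1: image -> text
        regenerated = generate(description)  # step 2: text -> auxiliary image
        # Step 3: compare original and generated images (assumed metric).
        scores.append(cosine_similarity(embed(image), embed(regenerated)))
    return sum(scores) / len(scores) if scores else 0.0


# --- Toy stand-ins so the sketch is runnable without real models ---
def faithful_describe(image: Vector) -> str:
    """A 'good' describer: keeps every component of the toy image."""
    return " ".join(str(v) for v in image)


def lossy_describe(image: Vector) -> str:
    """A 'poor' describer: keeps only the first component."""
    return str(image[0])


def toy_generate(description: str, size: int = 3) -> Vector:
    """Rebuild a toy image from text, padding missing components with zeros."""
    values = [float(tok) for tok in description.split()]
    return tuple(values + [0.0] * (size - len(values)))


def identity_embed(image: Vector) -> Vector:
    return image


toy_images = [(1.0, 2.0, 3.0), (0.5, 0.0, 4.0)]
faithful_score = evaluate_lmm(toy_images, faithful_describe, toy_generate, identity_embed)
lossy_score = evaluate_lmm(toy_images, lossy_describe, toy_generate, identity_embed)
print(f"faithful describer: {faithful_score:.3f}")
print(f"lossy describer:    {lossy_score:.3f}")
```

In this toy setting the describer that preserves more of the image scores higher, which is the intuition behind using the regenerated image as a proxy for description quality. Passing the models in as callables keeps the scoring loop independent of any particular LMM, generator, or encoder.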
Taxonomy
Topics: Natural Language Processing Techniques · Mathematics, Computing, and Information Processing
Methods: Focus
