TechImage-Bench: Rubric-Based Evaluation for Technical Image Generation
Minheng Ni, Zhengyuan Yang, Yaowen Zhang, Linjie Li, Chung-Ching Lin, Kevin Lin, Zhendong Wang, Xiaofei Wang, Shujie Liu, Lei Zhang, Wangmeng Zuo, Lijuan Wang

TL;DR
This paper introduces TechImage-Bench, a comprehensive benchmark with detailed rubrics for evaluating the scientific accuracy of technical image generation models, highlighting current gaps and potential for iterative improvement.
Contribution
The paper presents a rubric-based benchmark and automated evaluation framework for technical image generation, enabling fine-grained assessment of scientific correctness and iterative refinement of model outputs.
Findings
Best model achieves only 0.801 rubric accuracy
Significant gaps in scientific fidelity of generated images
Iterative refinement improves scores substantially
Abstract
We study technical image generation, where a model must synthesize information-dense, scientifically precise illustrations from detailed descriptions rather than merely produce visually plausible pictures. To quantify progress, we introduce TechImage-Bench, a rubric-based benchmark that targets biology schematics, engineering/patent drawings, and general technical illustrations. For 654 figures collected from real textbooks and technical reports, we construct detailed image instructions and a hierarchy of rubrics that decompose correctness into 6,076 criteria and 44,131 binary checks. Rubrics are derived from surrounding text and reference figures using large multimodal models, and are evaluated by an automated LMM-based judge with a principled penalty scheme that aggregates sub-question outcomes into interpretable criterion scores. We benchmark several representative text-to-image…
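The abstract describes a two-level rubric: each figure has criteria, and each criterion is broken into binary checks whose outcomes are aggregated into a criterion score. From the reported counts, that averages roughly 9.3 criteria per figure and 7.3 checks per criterion. The sketch below illustrates one way such an aggregation could work. The abstract does not state the actual penalty scheme, so the linear penalty rule, the unweighted mean, and all names (`Criterion`, `criterion_score`, `image_score`, `penalty`) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Criterion:
    """One rubric criterion plus the judge's yes/no answers to its binary checks."""
    name: str
    checks: List[bool]  # True means the judge marked the check as satisfied


def criterion_score(criterion: Criterion, penalty: float = 1.0) -> float:
    """Hypothetical rule: subtract `penalty` times the failed fraction, floored at 0.

    This is NOT the paper's penalty scheme, which the abstract does not specify.
    """
    if not criterion.checks:
        raise ValueError(f"criterion {criterion.name!r} has no checks")
    failed_fraction = criterion.checks.count(False) / len(criterion.checks)
    return max(0.0, 1.0 - penalty * failed_fraction)


def image_score(criteria: List[Criterion], penalty: float = 1.0) -> float:
    """Unweighted mean of criterion scores for one generated image (an assumption)."""
    if not criteria:
        raise ValueError("at least one criterion is required")
    return sum(criterion_score(c, penalty) for c in criteria) / len(criteria)


# Made-up judge outputs for a single generated figure.
labels = Criterion("labels_match_description", [True, True, False, True])
layout = Criterion("layout_is_correct", [True, True])
print(image_score([labels, layout]))
```

With `penalty` above 1.0, a single failed check costs more than its proportional share, which is one simple way to make a criterion strict about partial correctness.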
Taxonomy
Topics: Multimodal Machine Learning Applications · Cell Image Analysis Techniques · Generative Adversarial Networks and Image Synthesis
