SridBench: Benchmark of Scientific Research Illustration Drawing of Image Generation Model

Yifan Chang; Yukang Feng; Jianwen Sun; Jiaxin Ai; Chuanhao Li; S. Kevin Zhou; Kaipeng Zhang

arXiv:2505.22126 · cs.CV · May 29, 2025

Open Access

TL;DR

SridBench is a new benchmark for evaluating how accurately and clearly AI models generate scientific illustrations. It shows that current models still lag behind human performance on this complex task.

Contribution

This paper introduces SridBench, the first comprehensive benchmark for scientific figure generation, addressing a critical gap in evaluating AI's capabilities in technical visual creation.

Findings

01. Top models like GPT-4o-image underperform humans in scientific figure generation.

02. Models struggle with text clarity, visual quality, and scientific accuracy.

03. The benchmark highlights the need for more advanced reasoning in AI visual generation.

Abstract

Recent years have seen rapid advances in AI-driven image generation. Early diffusion models emphasized perceptual quality, while newer multimodal models like GPT-4o-image integrate high-level reasoning, improving semantic understanding and structural composition. Scientific illustration generation exemplifies this evolution: unlike general image synthesis, it demands accurate interpretation of technical content and transformation of abstract ideas into clear, standardized visuals. This task is significantly more knowledge-intensive and laborious, often requiring hours of manual work and specialized tools. Automating it in a controllable, intelligent manner would provide substantial practical value. Yet, no benchmark currently exists to evaluate AI on this front. To fill this gap, we introduce SridBench, the first benchmark for scientific figure generation. It comprises 1,120 instances…

Taxonomy

Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Data Visualization and Analytics