Scientific Image Synthesis: Benchmarking, Methodologies, and Downstream Utility
Honglin Lin, Chonghan Qin, Zheng Liu, Qizhi Pei, Yu Li, Zhanping Zhong, Xin Gao, Yanfeng Wang, Conghui He, Lijun Wu

TL;DR
This paper systematically evaluates scientific image synthesis methods, introduces a logic-driven framework and benchmark, and demonstrates that fine-tuning multimodal models on verified images enhances scientific reasoning capabilities.
Contribution
The paper presents ImgCoder, a structured, logic-driven synthesis framework, and SciGenBench, a benchmark for the scientific correctness of generated images, with the aim of improving the fidelity and utility of synthetic scientific images.
Findings
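Direct pixel-based T2I models show systematic failure modes, producing images that look plausible but are scientifically incorrect.
There is a fundamental trade-off between expressiveness and precision across the generation paradigms studied (direct pixel-based generation and programmatic synthesis).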
Fine-tuning LMMs on verified images improves reasoning performance.
Abstract
While synthetic data has proven effective for improving scientific reasoning in the text domain, multimodal reasoning remains constrained by the difficulty of synthesizing scientifically rigorous images. Existing Text-to-Image (T2I) models often produce outputs that are visually plausible yet scientifically incorrect, resulting in a persistent visual-logic divergence that limits their value for downstream reasoning. Motivated by recent advances in next-generation T2I models, we conduct a systematic study of scientific image synthesis across generation paradigms, evaluation, and downstream use. We analyze both direct pixel-based generation and programmatic synthesis, and propose ImgCoder, a logic-driven framework that follows an explicit "understand - plan - code" workflow to improve structural precision. To rigorously assess scientific correctness, we introduce SciGenBench, which…
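The "understand - plan - code" workflow named in the abstract can be pictured as three stages chained together. The sketch below is only an illustration of that staging, not ImgCoder's implementation: the function names (`understand`, `plan`, `code`, `synthesize`), the toy prompt format, and the choice of SVG as the output are all assumptions made for this example, and a real system would presumably use a language model where this sketch uses a hand-written parser.

```python
from dataclasses import dataclass


@dataclass
class Spec:
    """Structured reading of the prompt (the hypothetical 'understand' output)."""
    shape: str
    points: dict  # label -> (x, y) in problem coordinates


def understand(prompt: str) -> Spec:
    # Toy stand-in for the understanding step (assumption, not the paper's method).
    # Accepts only prompts shaped like: "triangle A(0,0) B(4,0) C(0,3)".
    tokens = prompt.split()
    points = {}
    for tok in tokens[1:]:
        label, coords = tok[0], tok[2:-1]
        x, y = (float(v) for v in coords.split(","))
        points[label] = (x, y)
    return Spec(shape=tokens[0], points=points)


def plan(spec: Spec) -> list:
    # Turn the spec into an ordered list of drawing steps:
    # one edge between each pair of consecutive points, then one label per point.
    labels = list(spec.points)
    steps = []
    for i, a in enumerate(labels):
        b = labels[(i + 1) % len(labels)]
        steps.append(("line", a, b))
    steps += [("label", name) for name in labels]
    return steps


def code(spec: Spec, steps: list, scale: float = 40) -> str:
    # Emit drawing code (here SVG markup) from the plan, so every coordinate
    # in the image is computed from the spec rather than painted pixel by pixel.
    def px(name):
        x, y = spec.points[name]
        return x * scale + 20, 180 - y * scale  # flip y: SVG's origin is top-left

    parts = ['<svg xmlns="http://www.w3.org/2000/svg" width="200" height="200">']
    for step in steps:
        if step[0] == "line":
            (x1, y1), (x2, y2) = px(step[1]), px(step[2])
            parts.append(
                f'<line x1="{x1:g}" y1="{y1:g}" x2="{x2:g}" y2="{y2:g}" stroke="black"/>'
            )
        elif step[0] == "label":
            x, y = px(step[1])
            parts.append(f'<text x="{x:g}" y="{y:g}">{step[1]}</text>')
    parts.append("</svg>")
    return "\n".join(parts)


def synthesize(prompt: str) -> str:
    spec = understand(prompt)
    return code(spec, plan(spec))


if __name__ == "__main__":
    print(synthesize("triangle A(0,0) B(4,0) C(0,3)"))
```

Because the intermediate `Spec` and plan are explicit data, each stage can be checked on its own before any image is rendered, which is the kind of structural precision a programmatic route is meant to offer over direct pixel generation.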
Taxonomy
Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · Data Visualization and Analytics
