AutoFigure: Generating and Refining Publication-Ready Scientific Illustrations

Minjun Zhu; Zhen Lin; Yixuan Weng; Panzhong Lu; Qiujie Xie; Yifan Wei; Sifan Liu; Qiyao Sun; Yue Zhang

arXiv:2602.03828·cs.AI·February 13, 2026

TL;DR

AutoFigure is a novel framework that automatically generates high-quality scientific illustrations from long-form texts, significantly improving the efficiency and quality of scientific communication.

Contribution

It introduces AutoFigure, the first agentic system for automatic scientific illustration generation, and presents FigureBench, a large-scale benchmark dataset for text-to-illustration tasks.

Findings

01

AutoFigure outperforms baseline methods in quality and aesthetics.

02

The dataset contains 3,300 high-quality scientific text-figure pairs.

03

AutoFigure produces publication-ready scientific illustrations.

Abstract

High-quality scientific illustrations are crucial for effectively communicating complex scientific and technical concepts, yet their manual creation remains a well-recognized bottleneck in both academia and industry. We present FigureBench, the first large-scale benchmark for generating scientific illustrations from long-form scientific texts. It contains 3,300 high-quality scientific text-figure pairs, covering diverse text-to-illustration tasks from scientific papers, surveys, blogs, and textbooks. Moreover, we propose AutoFigure, the first agentic framework that automatically generates high-quality scientific illustrations based on long-form scientific text. Specifically, before rendering the final result, AutoFigure engages in extensive thinking, recombination, and validation to produce a layout that is both structurally sound and aesthetically refined, outputting a scientific…

Peer Reviews

Decision·ICLR 2026 Poster

Reviewer 01 · Rating 4 · Confidence 4

Strengths

1. The paper introduces a modern benchmark for scientific illustration generation that includes diverse long-text-to-figure pairs from papers, surveys, blogs, and textbooks. The authors also provide dataset statistics and analysis showing the challenge of long-context reasoning.
2. The method uses an agent-based pipeline that first grounds concepts with a VLM to produce a symbolic layout, then performs iterative refinement to improve structure, and finally renders the figure. This decoupled desi
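The three stages named in this review (grounding concepts into a symbolic layout, iterative refinement, rendering) can be pictured with a toy sketch. Everything below is an assumption made for illustration: the `Layout` type, the function names, the `->` input convention, and the rule-based checks are hypothetical stand-ins for model calls, and the Graphviz DOT output replaces a styled image. It is not the authors' implementation.

```python
from dataclasses import dataclass, field


@dataclass
class Layout:
    """Hypothetical symbolic layout: named nodes plus directed edges."""
    nodes: list[str] = field(default_factory=list)
    edges: list[tuple[str, str]] = field(default_factory=list)


def ground_concepts(text: str) -> Layout:
    """Stage 1 stand-in: a real system would query a VLM/LLM here.

    This toy version treats each '->' separated phrase as one node
    and chains consecutive phrases with edges.
    """
    nodes = [part.strip() for part in text.split("->") if part.strip()]
    edges = list(zip(nodes, nodes[1:]))
    return Layout(nodes=nodes, edges=edges)


def critique(layout: Layout) -> list[str]:
    """Stage 2 stand-in: list structural problems found in the layout."""
    issues = []
    seen = set()
    for node in layout.nodes:
        if node in seen:
            issues.append(f"duplicate node: {node}")
        seen.add(node)
    for src, dst in layout.edges:
        if src == dst:
            issues.append(f"self-loop: {src}")
    return issues


def refine(layout: Layout) -> Layout:
    """Apply one round of fixes: drop duplicate nodes and self-loops."""
    nodes = list(dict.fromkeys(layout.nodes))  # de-duplicate, keep order
    edges = list(dict.fromkeys(e for e in layout.edges if e[0] != e[1]))
    return Layout(nodes=nodes, edges=edges)


def render(layout: Layout) -> str:
    """Stage 3 stand-in: emit Graphviz DOT text instead of a styled image."""
    lines = ["digraph figure {"]
    lines += [f'  "{n}";' for n in layout.nodes]
    lines += [f'  "{a}" -> "{b}";' for a, b in layout.edges]
    lines.append("}")
    return "\n".join(lines)


def generate_figure(text: str, max_rounds: int = 3) -> str:
    """Ground, then critique and refine until clean, then render."""
    layout = ground_concepts(text)
    for _ in range(max_rounds):
        if not critique(layout):
            break
        layout = refine(layout)
    return render(layout)
```

Calling `generate_figure("Encoder -> Encoder -> Decoder")` runs one refinement round that removes the repeated node and the resulting self-loop before rendering. Keeping the layout symbolic until the last step is what makes such checks cheap: the critique works on a small data structure, not on pixels.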

Weaknesses

1. The comparison to prior datasets is incomplete. The Paper2Fig100k [1] dataset is neither mentioned nor cited, even though it already contains more than 100k text-to-figure pairs. The claim that FigureBench is the first large-scale benchmark is therefore not correct and should be reframed more precisely.
2. There is no reference to recent TikZ-based diagram generation approaches such as AutomaTikZ [2], which are directly relevant to the diagram synthesis space.
4. The design of the VLM as judge

Reviewer 02 · Rating 4 · Confidence 3

Strengths

The paper addresses a very impactful task related to figure generation, which is important in many different industries and research domains. The authors perform a good and comprehensive human and automatic evaluation with several strong baselines. The paper is well-written and explains the problem in a clear and complete way. The authors include a detailed human evaluation setup, which is a good way to test whether the AutoFigure generation is actually good. They recruited 10 human experts t

Weaknesses

No open-source models were tested (the paper only mentions Gemini, Grok, Claude, and GPT). It is important to include methods that work well with open-source LLMs so the research can be reproduced at minimal cost and used more broadly. Many methodological details are missing. For example, when the paper says "identifying key entities and their relationships, and distilling a core methodology summary," it does not explain how this is actually done, since neither the system prompt nor the user prompt is shown. Usin

Reviewer 03 · Rating 2 · Confidence 4

Strengths

- This paper represents an important early step toward exploring how AI can assist humans in the time-consuming process of scientific illustration creation. The topic is interesting and promising, with significant potential for advancing AI-assisted scientific communication.
- AutoFigure is designed as a three-stage framework, where each stage addresses distinct challenges in the illustration generation process. These stages work in synergy, resembling how humans iteratively refine scientific f

Weaknesses

- In Figure 1, AutoFigure generates a scientific illustration for InstructGPT. However, the original InstructGPT paper does not mention any examples related to relativity, suggesting that AutoFigure may have extended the content beyond the source text. Moreover, there is an error in the generated example (“ravity” instead of “gravity”), indicating that in some cases, the framework may pay more attention to aesthetic appeal than scientific accuracy.
- As mentioned in the paper, scientific illust

Taxonomy

Topics: Multimodal Machine Learning Applications · Data Visualization and Analytics · Generative Adversarial Networks and Image Synthesis