SciGA: A Comprehensive Dataset for Designing Graphical Abstracts in Academic Papers

Takuro Kawada; Shunsuke Kitada; Sota Nemoto; Hitoshi Iyatomi

arXiv:2507.02212 · cs.CV · April 7, 2026

TL;DR

This paper introduces SciGA-145k, a dataset of approximately 145,000 scientific papers and 1.14 million figures, built to support graphical abstract (GA) selection and recommendation and to facilitate research in automated GA generation. It also proposes a new evaluation metric called CAR.

Contribution

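The paper contributes a large-scale dataset, SciGA-145k; two recommendation tasks for graphical abstracts, Intra-GA Recommendation (selecting figures within a paper that suit use as a GA) and Inter-GA Recommendation (retrieving GAs from other papers); and a new metric, CAR, for evaluating models on these tasks.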

Findings

01

Benchmark results validate the proposed tasks and metric.

02

The dataset supports research in automated GA generation.

03

CAR provides a more nuanced analysis of model behavior.

Abstract

Graphical Abstracts (GAs) play a crucial role in visually conveying the key findings of scientific papers. Although recent research increasingly incorporates visual materials such as Figure 1 as de facto GAs, their potential to enhance scientific communication remains largely unexplored. Designing effective GAs requires advanced visualization skills, hindering their widespread adoption. To tackle these challenges, we introduce SciGA-145k, a large-scale dataset comprising approximately 145,000 scientific papers and 1.14 million figures, specifically designed to support GA selection and recommendation, and to facilitate research in automated GA generation. As a preliminary step toward GA design support, we define two tasks: 1) Intra-GA Recommendation, identifying figures within a given paper well-suited as GAs, and 2) Inter-GA Recommendation, retrieving GAs from other papers to inspire…
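To make the Intra-GA Recommendation task concrete, here is a minimal sketch that treats it as a ranking problem: score each figure in a paper against the paper's abstract and sort. This is not the authors' method. The cosine-similarity scoring, the hand-written toy vectors, the helper names (`cosine`, `rank_figures`, `recall_at_k`), and the use of Recall@k are all assumptions for illustration. The CAR metric is not reproduced here because its definition is not given in this summary.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_figures(abstract_vec, figure_vecs):
    """Return figure ids sorted from most to least similar to the abstract vector."""
    scores = {fid: cosine(abstract_vec, vec) for fid, vec in figure_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


def recall_at_k(ranking, ga_id, k):
    """1.0 if the figure labeled as the GA appears in the top-k of the ranking, else 0.0."""
    return 1.0 if ga_id in ranking[:k] else 0.0


# Toy embeddings (hypothetical): in practice these would come from text and image encoders.
abstract_vec = [1.0, 0.0, 1.0]
figure_vecs = {
    "fig1": [0.9, 0.1, 0.8],
    "fig2": [0.0, 1.0, 0.0],
    "fig3": [0.5, 0.5, 0.0],
}

ranking = rank_figures(abstract_vec, figure_vecs)
print(ranking)                          # ['fig1', 'fig3', 'fig2']
print(recall_at_k(ranking, "fig1", 1))  # 1.0
```

Inter-GA Recommendation could be sketched the same way by ranking GAs drawn from other papers instead of figures from the same paper, again under the assumption that a similarity score is the ranking signal.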
