SBS Figures: Pre-training Figure QA from Stage-by-Stage Synthesized Images

Risa Shinoda; Kuniaki Saito; Shohei Tanaka; Tosho Hirasawa; Yoshitaka Ushiku

arXiv:2412.17606·cs.CV·December 24, 2024

TL;DR

SBSFigures is a new dataset created through a stage-by-stage pipeline that enables efficient pre-training of figure QA models with fully annotated, diverse synthetic figures, reducing manual effort and improving training effectiveness.

Contribution

The paper introduces SBSFigures, a novel synthetic figure dataset generated without manual annotation, enhancing pre-training for figure question answering tasks.

Findings

01. Pre-training with SBSFigures improves QA performance on real-world data.

02. The pipeline reduces code errors and increases diversity in synthetic figures.

03. Efficient training with limited real data is achievable using the pre-trained weights.

Abstract

Building a large-scale figure QA dataset requires a considerable amount of work, from gathering and selecting figures to extracting attributes like text, numbers, and colors, and generating QAs. Although recent developments in LLMs have led to efforts to synthesize figures, most of these focus primarily on QA generation. Additionally, creating figures directly using LLMs often encounters issues such as code errors, similar-looking figures, and repetitive content in figures. To address these issues, we present SBSFigures (Stage-by-Stage Synthetic Figures), a dataset for pre-training figure QA. Our proposed pipeline enables the creation of chart figures with complete annotations of the visualized data and dense QA annotations without any manual annotation process. Our stage-by-stage pipeline makes it possible to efficiently create figures that are diverse in topic and appearance while minimizing code…
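The abstract is truncated here and does not enumerate the pipeline's stages, so the following is only a minimal sketch of what a stage-by-stage synthesis could look like, not the authors' implementation. It assumes three stages (data generation, rendering with fixed code, QA generation), replaces any LLM call with a seeded random generator, and renders to a plain SVG string so that it needs only the standard library. All function names (`generate_data`, `render_svg`, `generate_qa`, `build_example`) are hypothetical.

```python
import random


def generate_data(seed: int) -> dict:
    """Stage 1 (assumed): produce the chart's underlying data as a record.

    A seeded random generator stands in for an LLM call.
    """
    rng = random.Random(seed)
    categories = ["A", "B", "C", "D"]
    return {
        "title": "Example bar chart",
        "categories": categories,
        "values": [rng.randint(1, 100) for _ in categories],
    }


def render_svg(data: dict, seed: int) -> str:
    """Stage 2 (assumed): render the record with fixed, pre-written code.

    Appearance (here only the bar color) is sampled independently of the data.
    """
    rng = random.Random(seed)
    color = rng.choice(["steelblue", "tomato", "seagreen"])
    bar_width, gap, baseline = 40, 20, 120
    bars = []
    for i, value in enumerate(data["values"]):
        x = gap + i * (bar_width + gap)
        bars.append(
            f'<rect x="{x}" y="{baseline - value}" '
            f'width="{bar_width}" height="{value}" fill="{color}"/>'
        )
    width = gap + len(data["values"]) * (bar_width + gap)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="140">'
        + "".join(bars)
        + "</svg>"
    )


def generate_qa(data: dict) -> list:
    """Stage 3 (assumed): derive QA pairs from the data record, not the image."""
    pairs = list(zip(data["categories"], data["values"]))
    top_category, _ = max(pairs, key=lambda pair: pair[1])
    return [
        {"question": "Which category has the highest value?", "answer": top_category},
        {"question": "What is the sum of all values?", "answer": str(sum(data["values"]))},
    ]


def build_example(seed: int) -> dict:
    """Run the three assumed stages and bundle image, annotations, and QA."""
    data = generate_data(seed)
    return {
        "data": data,
        "image_svg": render_svg(data, seed + 1),
        "qa": generate_qa(data),
    }
```

Because the rendering code in this sketch is fixed rather than generated per figure, it cannot fail with a fresh code error, and because the QA pairs are computed from the same record that was drawn, every figure comes with matching data and QA annotations without manual labeling.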

Code & Models

Repositories

omron-sinicx/sbsfigures
PyTorch · Official

Models

Datasets

omron-sinicx/sbsfigures
dataset · 296 downloads

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Image Processing and 3D Reconstruction

Methods: Focus