FigureQA: An Annotated Figure Dataset for Visual Reasoning

Samira Ebrahimi Kahou; Vincent Michalski; Adam Atkinson; Akos Kadar; Adam Trischler; Yoshua Bengio

arXiv:1710.07300 · cs.CV · February 26, 2018 · 123 citations

TL;DR

FigureQA is a large synthetic dataset of scientific-style figures with question-answer pairs, designed to advance visual reasoning in machine learning. It covers five figure types (line plots, dot-line plots, vertical and horizontal bar graphs, and pie charts) and includes auxiliary data for training.

Contribution

The paper introduces a visual reasoning corpus of over one million question-answer pairs grounded in over 100,000 synthetic figures. It also provides side data, including the numerical data used to generate each figure, that can be used to formulate auxiliary training objectives.

Findings

01 Relation Network performs well as a baseline

02 The task is challenging for current models

03 Auxiliary data aids in training models

Abstract

We introduce FigureQA, a visual reasoning corpus of over one million question-answer pairs grounded in over 100,000 images. The images are synthetic, scientific-style figures from five classes: line plots, dot-line plots, vertical and horizontal bar graphs, and pie charts. We formulate our reasoning task by generating questions from 15 templates; questions concern various relationships between plot elements and examine characteristics like the maximum, the minimum, area-under-the-curve, smoothness, and intersection. To resolve, such questions often require reference to multiple plot elements and synthesis of information distributed spatially throughout a figure. To facilitate the training of machine learning systems, the corpus also includes side data that can be used to formulate auxiliary objectives. In particular, we provide the numerical data used to generate each figure as well as…
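
The template-based question generation described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' generation code: the three template wordings, the yes/no answer format, the curve names, the sample values, and the helper names `area_under_curve`, `curves_intersect`, and `generate_questions` are all hypothetical. The sketch only shows how answers could be derived from the numerical data behind a line plot.

```python
from itertools import combinations


def area_under_curve(xs, ys):
    """Trapezoidal-rule area under a piecewise-linear curve."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])
    )


def curves_intersect(ys_a, ys_b):
    """True if two curves sampled at the same x values touch or cross."""
    diffs = [a - b for a, b in zip(ys_a, ys_b)]
    if any(d == 0 for d in diffs):
        return True  # the curves share a sampled point
    # A sign change between neighbouring samples means the segments cross.
    return any(d0 * d1 < 0 for d0, d1 in zip(diffs, diffs[1:]))


def generate_questions(xs, series):
    """Return (question, answer) pairs for one line plot.

    `series` maps a (hypothetical) curve name to its y values.
    The template wordings are illustrative assumptions.
    """
    aucs = {name: area_under_curve(xs, ys) for name, ys in series.items()}
    max_name = max(aucs, key=aucs.get)
    min_name = min(aucs, key=aucs.get)

    qa = []
    for name in series:
        qa.append((f"Does {name} have the maximum area under the curve?",
                   name == max_name))
        qa.append((f"Does {name} have the minimum area under the curve?",
                   name == min_name))
    for a, b in combinations(series, 2):
        qa.append((f"Does {a} intersect {b}?",
                   curves_intersect(series[a], series[b])))
    return qa


# Made-up numerical data for a three-curve line plot.
xs = [0, 1, 2, 3]
series = {
    "Red": [0, 1, 2, 3],
    "Blue": [3, 2, 1, 1],
    "Green": [4, 4, 4, 4],
}

qa_pairs = generate_questions(xs, series)
for question, answer in qa_pairs:
    print(question, "yes" if answer else "no")
```

Each question here depends on more than one plot element: the area templates compare a curve against all others, and the intersection template compares a pair. That is the kind of multi-element reference the abstract describes, although a model working from the figure would have to infer these relations from pixels rather than from the underlying numbers.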

Code & Models

Repositories

vmichals/FigureQA-baseline · tf · Official

Taxonomy

Topics: Multimodal Machine Learning Applications · Video Analysis and Summarization · Advanced Image and Video Retrieval Techniques