VisImages: A Fine-Grained Expert-Annotated Visualization Dataset
Dazhen Deng, Yihong Wu, Xinhuan Shu, Jiang Wu, Siwei Fu, Weiwei Cui, Yingcai Wu

TL;DR
VisImages is a comprehensive, expert-annotated dataset of over 12,000 visualization images from research papers, enabling analysis, classification, and localization tasks to advance visualization research.
Contribution
The paper introduces VisImages, a large, expert-annotated dataset of visualization images with a comprehensive taxonomy, supporting various automated visualization analysis tasks.
Findings
VisImages enables visualization classification benchmarking.
The dataset supports automatic localization of visualizations within visual analytics systems.
Use cases demonstrate its utility in research and automation.
Abstract
Images in visualization publications contain rich information, e.g., novel visualization designs and implicit design patterns of visualizations. A systematic collection of these images can contribute to the community in many aspects, such as literature analysis and automated tasks for visualization. In this paper, we build and make public a dataset, VisImages, which collects 12,267 images with captions from 1,397 papers in IEEE InfoVis and VAST. Built upon a comprehensive visualization taxonomy, the dataset includes 35,096 visualizations and their bounding boxes in the images. We demonstrate the usefulness of VisImages through three use cases: 1) investigating the use of visualizations in the publications with VisImages Explorer, 2) training and benchmarking models for visualization classification, and 3) localizing visualizations in the visual analytics systems automatically.
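To put the counts in the abstract in perspective, the short sketch below derives per-paper and per-image averages from them. Only the three totals come from the abstract; the variable names and the `per_unit` helper are illustrative assumptions, not part of the dataset or its tooling.

```python
# Totals quoted in the abstract; everything else here is illustrative.
NUM_PAPERS = 1_397
NUM_IMAGES = 12_267
NUM_VISUALIZATIONS = 35_096


def per_unit(total: int, units: int) -> float:
    """Return the average number of items per unit, rounded to two decimals."""
    return round(total / units, 2)


images_per_paper = per_unit(NUM_IMAGES, NUM_PAPERS)
visualizations_per_image = per_unit(NUM_VISUALIZATIONS, NUM_IMAGES)
visualizations_per_paper = per_unit(NUM_VISUALIZATIONS, NUM_PAPERS)

print(f"images per paper:         {images_per_paper}")
print(f"visualizations per image: {visualizations_per_image}")
print(f"visualizations per paper: {visualizations_per_paper}")
```

On these figures, a paper contributes roughly 8.8 images on average, and an image contains roughly 2.9 annotated visualizations. That is consistent with many images being multi-view figures rather than single charts.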
