VisAnatomy: An SVG Chart Corpus with Fine-Grained Semantic Labels
Chen Chen, Hannah K. Bako, Peihong Yu, John Hooker, Jeffrey Joyal, Simon C. Wang, Samuel Kim, Jessica Wu, Aoxue Ding, Lara Sandeep, Alex Chen, Chayanika Sinha, Zhicheng Liu

TL;DR
VisAnatomy is a corpus of 942 real-world SVG charts with fine-grained, multi-level semantic labels covering more than 383,000 graphical elements, intended to support visualization research and AI applications.
Contribution
The paper introduces VISANATOMY, a large-scale SVG chart corpus whose multi-level semantic labels are more detailed and more diverse than those in existing corpora for visualization research.
Findings
Demonstrated the corpus's utility in four visualization tasks.
Compared VISANATOMY's richness with existing corpora.
Showcased applications in semantic inference and chart classification.
Abstract
Chart corpora, which comprise data visualizations and their semantic labels, are crucial for advancing visualization research. However, the labels in most existing corpora are high-level (e.g., chart types), hindering their utility for broader applications in the era of AI. In this paper, we contribute VISANATOMY, a corpus containing 942 real-world SVG charts produced by over 50 tools, encompassing 40 chart types and featuring structural and stylistic design variations. Each chart is augmented with multi-level fine-grained labels on its semantic components, including each graphical element's type, role, and position, hierarchical groupings of elements, group layouts, and visual encodings. In total, VISANATOMY provides labels for more than 383k graphical elements. We demonstrate the richness of the semantic labels by comparing VISANATOMY with existing corpora. We illustrate its…
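To make the label levels named in the abstract concrete (element type, role, and position; hierarchical groupings; group layouts; visual encodings), here is a minimal Python sketch of how such labels could be held in memory. It is an illustration only and not the corpus's actual file format: the class names, field names, role strings, layout strings, and encoding keys are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Union


@dataclass
class Element:
    """One labeled graphical element (hypothetical schema)."""
    id: str
    type: str                                # SVG element type, e.g. "rect"
    role: str                                # semantic role (name is illustrative)
    bbox: Tuple[float, float, float, float]  # position as (x, y, width, height)


@dataclass
class Group:
    """A hierarchical grouping of elements with a layout label (hypothetical)."""
    id: str
    layout: str                              # layout name is illustrative
    children: List[Union["Group", Element]] = field(default_factory=list)


@dataclass
class ChartLabels:
    """All labels for one chart (hypothetical container)."""
    chart_type: str
    root: Group
    encodings: List[Dict[str, str]] = field(default_factory=list)


def count_elements(node: Union[Group, Element]) -> int:
    """Count leaf elements under a node by walking the group hierarchy."""
    if isinstance(node, Element):
        return 1
    return sum(count_elements(child) for child in node.children)


# A toy bar chart: three bars in one group, plus a title at the top level.
bars = Group(
    id="g-bars",
    layout="stack",
    children=[
        Element(f"bar-{i}", "rect", "main chart mark", (10.0 + 30 * i, 20.0, 20.0, 80.0))
        for i in range(3)
    ],
)
title = Element("title", "text", "chart title", (10.0, 5.0, 100.0, 12.0))
chart = ChartLabels(
    chart_type="bar chart",
    root=Group(id="g-root", layout="none", children=[bars, title]),
    encodings=[{"channel": "height", "field": "value"}],
)
```

Calling `count_elements(chart.root)` on the toy chart walks both group levels and counts the three bars and the title. Summing such counts over every chart in a corpus is one way a total like the 383k figure could be tallied, although the abstract does not describe how the authors computed it.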
Taxonomy
Topics: Natural Language Processing Techniques
