CAVA: A Visual Analytics System for Exploratory Columnar Data Augmentation Using Knowledge Graphs
Dylan Cashman, Shenyu Xu, Subhajit Das, Florian Heimerl, Cong Liu, Shah Rukh Humayoun, Michael Gleicher, Alex Endert, Remco Chang

TL;DR
CAVA is a visual analytics system that integrates data curation and augmentation via knowledge graphs, enabling in-situ data foraging and complex attribute construction during analysis to improve insights.
Contribution
The paper introduces CAVA, a system that combines visual analytics with knowledge graph-based data augmentation, allowing iterative, in-situ data foraging during analysis.
Findings
CAVA enables users to perform complex data combinations without programming.
The system improves analysis outcomes through effective in-situ data foraging.
A user study confirms CAVA's effectiveness in real-world scenarios.
Abstract
Most visual analytics systems assume that all foraging for data happens before the analytics process; once analysis begins, the set of data attributes considered is fixed. Such separation of data construction from analysis precludes iteration that can enable foraging informed by the needs that arise in-situ during the analysis. The separation of the foraging loop from the data analysis tasks can limit the pace and scope of analysis. In this paper, we present CAVA, a system that integrates data curation and data augmentation with the traditional data exploration and analysis tasks, enabling information foraging in-situ during analysis. Identifying attributes to add to the dataset is difficult because it requires human knowledge to determine which available attributes will be helpful for the ensuing analytical tasks. CAVA crawls knowledge graphs to provide users with a broad set of…
Taxonomy
Topics: Data Visualization and Analytics · Semantic Web and Ontologies · Scientific Computing and Data Management
