Empirical Insights into Analytic Provenance Summarization: A Study on Segmenting Data Analysis Workflows
Shaghayegh Esmaeili, Irelis D. Suarez, Ezekiel Ajayi, Eric D. Ragan

TL;DR
This paper investigates how humans naturally segment and summarize visual data analysis workflows, providing empirical insights to inform the development of automated provenance-summarization algorithms for better interpretability.
Contribution
It offers a detailed empirical study of user behavior in data analysis segmentation, informing the design of more effective automated summarization tools.
Findings
Identifies key patterns in how users segment workflows
Reveals how data-driven actions and strategic thinking interact in users' segmentation decisions
Provides a foundation for algorithm development based on human behavior
Abstract
The complexity of exploratory data analysis poses significant challenges for collaboration and effective communication of analytic workflows. Automated methods can alleviate these challenges by summarizing workflows into more interpretable segments, but designing effective provenance-summarization algorithms depends on understanding the factors that guide how humans segment their analysis. To address this, we conducted an empirical study that explores how users naturally present, communicate, and summarize visual data analysis activities. Our qualitative analysis uncovers key patterns and high-level categories that inform users' decisions when segmenting analytic workflows, revealing the nuanced interplay between data-driven actions and strategic thinking. These insights provide a robust empirical foundation for algorithm development and highlight critical factors that must be…
Taxonomy
Topics: Scientific Computing and Data Management · Data Quality and Management · Research Data Management Practices
