Goals, Process, and Challenges of Exploratory Data Analysis: An Interview Study
Kanit Wongsuphasawat, Yang Liu, Jeffrey Heer

TL;DR
This study investigates how analysis goals and context influence exploratory data analysis (EDA) through interviews with data analysts, revealing common goals, challenges, and opportunities for tool improvements.
Contribution
The paper provides empirical insights into the goals, processes, and challenges of EDA, distinguishes profiling from discovery, and suggests design opportunities for EDA tools.
Findings
Profiling is a constant goal across analyses.
Discovery occurs mainly in open-ended analyses.
Analysts must perform repetitive tasks, often with limited time or domain knowledge.
Abstract
How do analysis goals and context affect exploratory data analysis (EDA)? To investigate this question, we conducted semi-structured interviews with 18 data analysts. We characterize common exploration goals: profiling (assessing data quality) and discovery (gaining new insights). Though the EDA literature primarily emphasizes discovery, we observe that discovery only reliably occurs in the context of open-ended analyses, whereas all participants engage in profiling across all of their analyses. We describe the process and challenges of EDA highlighted by our interviews. We find that analysts must perform repetitive tasks (e.g., examine numerous variables), yet they may have limited time or lack domain knowledge to explore data. Analysts also often have to consult other stakeholders and oscillate between exploration and other tasks, such as acquiring and wrangling additional data. Based…
Taxonomy
Topics: Data Visualization and Analytics · Time Series Analysis and Forecasting · Data Stream Mining Techniques
