Guided Statistical Workflows with Interactive Explanations and Assumption Checking
Yuqi Zhang, Adam Perer, Will Epperson

TL;DR
GuidedStats is an interactive tool integrated into computational notebooks that guides users through statistical analysis workflows, providing explanations, visualizations, and assumption checks to improve validity and usability.
Contribution
The paper introduces an interactive, step-by-step workflow system that combines guidance, visualization, and assumption checking for statistical analyses within notebooks.
Findings
Helps users identify assumption violations in data.
Facilitates iterative and guided statistical analysis.
Supports exporting results for further coding.
Abstract
Statistical practices such as building regression models or running hypothesis tests rely on following rigorous procedures of steps and verifying assumptions on data to produce valid results. However, common statistical tools do not verify users' decision choices and provide low-level statistical functions without instructions on the whole analysis practice. Users can easily misuse analysis methods, potentially decreasing the validity of results. To address this problem, we introduce GuidedStats, an interactive interface within computational notebooks that encapsulates guidance, models, visualization, and exportable results into interactive workflows. It breaks down typical analysis processes, such as linear regression and two-sample T-tests, into interactive steps supplemented with automatic visualizations and explanations for step-wise evaluation. Users can iterate on input choices to…
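The idea of breaking a two-sample t-test into checked steps can be illustrated with a minimal sketch. This is not GuidedStats' API or implementation. Every name here (`Check`, `check_sample_size`, `check_equal_variance`, `two_sample_t`, `guided_t_test`) is hypothetical, and the variance-ratio threshold of 4 is only an illustrative rule of thumb, not a formal test. The sketch uses the Python standard library only and reports the t statistic and degrees of freedom without a p-value.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import mean, variance


@dataclass
class Check:
    """Outcome of one workflow step (hypothetical structure)."""
    name: str
    passed: bool
    detail: str


def check_sample_size(a, b, minimum=3):
    """Each group needs enough points to estimate a variance."""
    ok = min(len(a), len(b)) >= minimum
    return Check("sample size", ok, f"n1={len(a)}, n2={len(b)}")


def check_equal_variance(a, b, max_ratio=4.0):
    """Heuristic only: flag groups whose sample variances differ a lot."""
    small, large = sorted((variance(a), variance(b)))
    ratio = large / small if small > 0 else float("inf")
    return Check("equal variance", ratio <= max_ratio,
                 f"variance ratio={ratio:.2f}")


def two_sample_t(a, b, pooled):
    """Return (t, df) for Student's (pooled) or Welch's t-test."""
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a), variance(b)
    if v1 == 0 and v2 == 0:
        raise ValueError("both groups are constant; t is undefined")
    diff = mean(a) - mean(b)
    if pooled:
        # Pooled variance assumes both groups share one variance.
        sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        se2 = sp2 * (1 / n1 + 1 / n2)
        df = n1 + n2 - 2
    else:
        # Welch-Satterthwaite approximation for unequal variances.
        se2 = v1 / n1 + v2 / n2
        df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1)
                         + (v2 / n2) ** 2 / (n2 - 1))
    return diff / sqrt(se2), df


def guided_t_test(a, b):
    """Run the checks in order and pick the test variant they support."""
    size = check_sample_size(a, b)
    if not size.passed:
        # Stop early: later steps are not meaningful on tiny samples.
        return {"checks": [size], "method": None, "t": None, "df": None}
    var = check_equal_variance(a, b)
    method = "student" if var.passed else "welch"
    t, df = two_sample_t(a, b, pooled=var.passed)
    return {"checks": [size, var], "method": method, "t": t, "df": df}
```

Calling `guided_t_test(group_a, group_b)` returns the list of step outcomes alongside the chosen method, so a user can see which assumption check drove the choice and rerun with different inputs.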
Taxonomy
Topics: Scientific Computing and Data Management · Advanced Database Systems and Queries · Data Mining Algorithms and Applications
