PCS Workflow for Veridical Data Science in the Age of AI
Zachary T. Rewolinski, Bin Yu

TL;DR
This paper introduces an updated PCS workflow for veridical (truthful) data science. The workflow addresses uncertainty throughout the data science life cycle, incorporates guided use of generative AI tools, and is demonstrated through examples and case studies.
Contribution
The paper presents an updated and streamlined PCS workflow tailored for practitioners, adds guidance on using generative AI within it, and illustrates the workflow with practical examples and case studies.
Findings
The PCS workflow accounts for uncertainty introduced by choices made throughout the data science life cycle.
Guided use of generative AI supports decision-making within the PCS workflow.
A case study shows how judgment calls affect prediction uncertainty.
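To make the last finding concrete, here is a minimal sketch of how judgment calls can move a prediction. It is not the paper's case study or the PCS workflow itself: the synthetic data, the two judgment calls (mean versus median imputation, clipping versus keeping extreme responses), and every function name are assumptions chosen purely for illustration.

```python
import random
import statistics
from itertools import product


def make_data(n=200, seed=0):
    """Synthetic data (hypothetical): y = 2x + 1 + noise, with some
    inflated responses and some missing x values."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.uniform(0, 10)
        y = 2.0 * x + 1.0 + rng.gauss(0, 1)
        if rng.random() < 0.05:
            y += rng.uniform(20, 40)  # occasional outlier in the response
        if rng.random() < 0.10:
            x = None  # missing predictor value
        xs.append(x)
        ys.append(y)
    return xs, ys


def impute(xs, how):
    """Judgment call 1: fill missing x with the observed mean or median."""
    observed = [x for x in xs if x is not None]
    fill = statistics.mean(observed) if how == "mean" else statistics.median(observed)
    return [fill if x is None else x for x in xs]


def clip_outliers(ys, do_clip):
    """Judgment call 2: optionally clip y to its 5th-95th percentile range."""
    if not do_clip:
        return list(ys)
    cuts = statistics.quantiles(ys, n=20)  # 19 cut points: 5%, 10%, ..., 95%
    lo, hi = cuts[0], cuts[-1]
    return [min(max(y, lo), hi) for y in ys]


def fit_line(xs, ys):
    """Ordinary least squares for a single predictor."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def predictions_across_choices(x_new=5.0):
    """Refit under every combination of judgment calls and predict at x_new."""
    xs, ys = make_data()
    preds = {}
    for how, do_clip in product(["mean", "median"], [False, True]):
        slope, intercept = fit_line(impute(xs, how), clip_outliers(ys, do_clip))
        preds[(how, do_clip)] = slope * x_new + intercept
    return preds


preds = predictions_across_choices()
spread = max(preds.values()) - min(preds.values())
for choice, value in sorted(preds.items()):
    print(choice, round(value, 3))
print("spread across judgment calls:", round(spread, 3))
```

The spread printed at the end is a crude indication of how much the prediction depends on analyst choices rather than on the data alone. A real analysis would consider only choices that are defensible for the problem at hand and would also check predictive performance before trusting any of the fits.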
Abstract
Data science is a pillar of artificial intelligence (AI), which is transforming nearly every domain of human activity, from the social and physical sciences to engineering and medicine. While data-driven findings in AI offer unprecedented power to extract insights and guide decision-making, many are difficult or impossible to replicate. A key reason for this challenge is the uncertainty introduced by the many choices made throughout the data science life cycle (DSLC). Traditional statistical frameworks often fail to account for this uncertainty. The Predictability-Computability-Stability (PCS) framework for veridical (truthful) data science offers a principled approach to addressing this challenge throughout the DSLC. This paper presents an updated and streamlined PCS workflow, tailored for practitioners and enhanced with guided use of generative AI. We include a running example to…
