VDSAgents: A PCS-Guided Multi-Agent System for Veridical Data Science Automation
Yunxuan Jiang (School of Management, Xi'an Jiaotong University), Silan Hu (School of Computing, National University of Singapore), Xiaoning Wang (School of Data Science and Media Intelligence, Communication University of China)

TL;DR
VDSAgents is a multi-agent system for data science automation guided by the Predictability-Computability-Stability (PCS) principles. By building these scientific principles into LLM-driven workflows, it aims to improve trustworthiness and robustness, and it outperforms AutoKaggle and DataInterpreter across nine datasets.
Contribution
This work presents VDSAgents, a multi-agent system grounded in PCS principles that supports structured, scientifically auditable data science automation.
Findings
VDSAgents outperforms AutoKaggle and DataInterpreter on nine datasets.
Embedding PCS principles improves system robustness and trustworthiness.
The modular workflow enhances scientific auditability and system reliability.
Abstract
Large language models (LLMs) are becoming increasingly integrated into data science workflows for automated system design. However, these LLM-driven data science systems rely solely on the internal reasoning of LLMs and lack guidance from scientific and theoretical principles. This limits their trustworthiness and robustness, especially when dealing with noisy and complex real-world datasets. This paper presents VDSAgents, a multi-agent system grounded in the Predictability-Computability-Stability (PCS) principles proposed in the Veridical Data Science (VDS) framework. Guided by PCS principles, the system implements a modular workflow for data cleaning, feature engineering, modeling, and evaluation. Each phase is handled by a dedicated agent that incorporates perturbation analysis, unit testing, and model validation to ensure both functionality and scientific auditability. We evaluate VDSAgents…
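The perturbation analysis mentioned in the abstract can be illustrated with a small stability check. The sketch below is not the VDSAgents implementation; it assumes a toy one-feature regression, uses bootstrap resampling as the data perturbation, and reports how much the fitted slope varies across perturbed copies of the data. The function names (`fit_slope_intercept`, `bootstrap_stability`) and the synthetic dataset are hypothetical and chosen only for illustration.

```python
import random
import statistics


def fit_slope_intercept(xs, ys):
    """Ordinary least squares fit for a single feature."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def bootstrap_stability(xs, ys, n_perturbations=200, seed=0):
    """Refit the model on bootstrap-resampled data and summarize the slope.

    Returns the mean and standard deviation of the slope across
    perturbations. A small standard deviation suggests the estimate is
    stable under this particular kind of data perturbation.
    """
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(n_perturbations):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            # Degenerate resample: the slope is undefined, so skip it.
            continue
        slope, _ = fit_slope_intercept(bx, by)
        slopes.append(slope)
    return statistics.fmean(slopes), statistics.stdev(slopes)


# Synthetic data (an assumption for this sketch): y = 2x + 1 plus noise.
data_rng = random.Random(42)
xs = [i / 10 for i in range(50)]
ys = [2 * x + 1 + data_rng.gauss(0, 0.3) for x in xs]

mean_slope, sd_slope = bootstrap_stability(xs, ys)
print(f"slope across perturbations: mean={mean_slope:.3f}, sd={sd_slope:.3f}")
```

In a PCS-style workflow, a check like this would sit alongside predictive validation: a result is reported only if it holds up both on held-out data and across reasonable perturbations of the data and modeling choices.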
