VDSAgents: A PCS-Guided Multi-Agent System for Veridical Data Science Automation

Yunxuan Jiang (School of Management, Xi'an Jiaotong University); Silan Hu (School of Computing, National University of Singapore); Xiaoning Wang (School of Data Science, Media Intelligence, Communication University of China); Yuanyuan Zhang (Beijing Baixingkefu Network Technology Co., Ltd.); and Xiangyu Chang (School of Management, Xi'an Jiaotong University)

arXiv:2510.24339·cs.AI·October 30, 2025

TL;DR

VDSAgents is a multi-agent system guided by the Predictability-Computability-Stability (PCS) principles. It integrates these scientific principles into LLM-driven workflows to improve the trustworthiness and robustness of data science automation, and it outperforms AutoKaggle and DataInterpreter on nine datasets.

Contribution

This work presents VDSAgents, a novel multi-agent system grounded in PCS principles, for structured and scientifically auditable data science automation.

Findings

01. VDSAgents outperforms AutoKaggle and DataInterpreter on nine datasets.

02. Embedding PCS principles improves system robustness and trustworthiness.

03. The modular workflow enhances scientific auditability and system reliability.

Abstract

Large language models (LLMs) are becoming increasingly integrated into data science workflows for automated system design. However, these LLM-driven data science systems rely solely on the internal reasoning of LLMs, lacking guidance from scientific and theoretical principles. This limits their trustworthiness and robustness, especially when dealing with noisy and complex real-world datasets. This paper presents VDSAgents, a multi-agent system grounded in the Predictability-Computability-Stability (PCS) principles proposed in the Veridical Data Science (VDS) framework. Guided by PCS principles, the system implements a modular workflow for data cleaning, feature engineering, modeling, and evaluation. Each phase is handled by an elegant agent, incorporating perturbation analysis, unit testing, and model validation to ensure both functionality and scientific auditability. We evaluate VDSAgents…
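The perturbation analysis mentioned in the abstract can be illustrated with a small stability check in the spirit of PCS. This is a minimal sketch, not the authors' implementation: the function names `fit_slope_intercept` and `stability_check`, the choice of bootstrap resampling plus Gaussian label noise as the perturbation, and the synthetic data are all assumptions made for illustration.

```python
import random
import statistics


def fit_slope_intercept(xs, ys):
    """Ordinary least squares for a single feature (hypothetical helper)."""
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def stability_check(xs, ys, n_perturbations=200, noise_sd=0.1, seed=0):
    """Refit the model on perturbed copies of the data and report how much
    the fitted slope moves. Assumed perturbations: bootstrap resampling of
    rows plus Gaussian noise added to the labels."""
    rng = random.Random(seed)
    n = len(xs)
    slopes = []
    for _ in range(n_perturbations):
        idx = [rng.randrange(n) for _ in range(n)]
        px = [xs[i] for i in idx]
        py = [ys[i] + rng.gauss(0.0, noise_sd) for i in idx]
        if len(set(px)) < 2:
            continue  # skip degenerate resamples where the slope is undefined
        slope, _ = fit_slope_intercept(px, py)
        slopes.append(slope)
    return statistics.fmean(slopes), statistics.stdev(slopes)


# Synthetic data (an assumption): y = 2x + 1 with small Gaussian noise.
data_rng = random.Random(42)
xs = [i / 10 for i in range(50)]
ys = [2.0 * x + 1.0 + data_rng.gauss(0.0, 0.1) for x in xs]

mean_slope, spread = stability_check(xs, ys)
print(f"mean slope across perturbations: {mean_slope:.3f}, spread: {spread:.3f}")
```

A small spread relative to the mean suggests the fitted relationship is stable under these perturbations; a large spread would flag a result that depends heavily on the particular sample or on label noise.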
