EigenData: A Self-Evolving Multi-Agent Platform for Function-Calling Data Synthesis, Auditing, and Repair
Jiaao Chen, Jingyuan Qi, Mingye Gao, Wei-Chen Wang, Hanrui Wang, Di Jin

TL;DR
EigenData is a comprehensive, self-evolving platform that automates data generation, auditing, and repair for function-calling agents, significantly improving benchmark quality and model evaluation accuracy.
Contribution
The paper introduces EigenData, a novel multi-agent system that automates the entire data lifecycle for function-calling models, including auditing and repair, with a focus on systematic error correction.
Findings
Successfully identified and corrected systematic errors in the BFCL-V3 benchmark.
Repaired benchmark data led to model rankings more aligned with human judgments.
Demonstrated improved evaluation metrics based on outcome-aware protocols.
Abstract
Function-calling agents -- large language models that invoke tools and APIs -- require high-quality, domain-specific training data spanning executable environments, backing databases, and diverse multi-turn trajectories. We introduce EigenData, an integrated, self-evolving platform that automates the full data lifecycle through a multi-agent architecture. A top-level orchestrator, EigenCore, coordinates three specialized sub-systems: DatabaseAgent for realistic domain database construction, CodingAgent for verified executable environment generation with iterative test-debug loops, and DataAgent for multi-turn trajectory synthesis with self-evolving prompt optimization. Cross-component feedback ensures consistency across all artifacts. We apply EigenData to audit and repair the Berkeley Function-Calling Leaderboard (BFCL-V3), identifying systematic errors in function schemas,…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsModel-Driven Software Engineering Techniques · Natural Language Processing Techniques · Software Engineering Research
