Efficient Multi-Agent System Training with Data Influence-Oriented Tree Search

Wentao Shi, Zichun Yu, Fuli Feng, Xiangnan He, and Chenyan Xiong

arXiv:2502.00955·cs.CL·April 27, 2026

TL;DR

This paper introduces DITS, a new framework that uses influence scores instead of Q-values in Monte Carlo Tree Search to select impactful data, improving multi-agent system training efficiency and effectiveness.

Contribution

The paper proposes influence score-based data selection within MCTS for multi-agent systems, reducing computational costs and better aligning data synthesis with training objectives.

Findings

1. Influence scores outperform Q-values at identifying the data that most improves training.

2. Spending inference compute on estimating influence scores, rather than Q-values, improves training efficiency.

3. Experiments on eight datasets validate the robustness of the proposed method.

Abstract

Monte Carlo Tree Search (MCTS) based methods provide promising approaches for generating synthetic data to enhance the self-training of Large Language Model (LLM) based multi-agent systems (MAS). These methods leverage Q-values to estimate individual agent contributions. However, relying solely on Q-values to identify informative data may misalign with the data synthesis objective, as the focus should be on selecting data that best enhances model training. To address this discrepancy, we propose Data Influence-oriented Tree Search (DITS), a novel framework that incorporates influence scores to guide both tree search and data selection. By leveraging influence scores, we effectively identify the most impactful data for system improvement, thereby enhancing model performance. Furthermore, we derive influence score estimation methods tailored for non-differentiable metrics, significantly…
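
To make the contrast between Q-value selection and influence-based selection concrete, here is a minimal sketch on a toy least-squares model. It is not the estimator derived in the paper (the abstract mentions estimators tailored to non-differentiable metrics, which this excerpt does not show). Instead it swaps in a generic first-order gradient approximation: the estimated drop in validation loss after one gradient step on a candidate example. The `Candidate` class, `influence_score`, `select_top_k`, the learning rate, and all data values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Candidate:
    """A synthesized example attached to a tree node (hypothetical structure)."""
    name: str
    x: List[float]   # feature vector
    y: float         # target
    q_value: float   # search-time value estimate for the node


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(ai * bi for ai, bi in zip(a, b))


def grad_squared_error(w: Sequence[float], x: Sequence[float], y: float) -> List[float]:
    """Gradient of 0.5 * (w.x - y)^2 with respect to w."""
    residual = dot(w, x) - y
    return [residual * xi for xi in x]


def influence_score(
    w: Sequence[float],
    cand: Candidate,
    val_set: Sequence[Tuple[List[float], float]],
    lr: float = 0.1,
) -> float:
    """First-order estimate of the validation-loss reduction from one step on `cand`.

    L_val(w - lr * g_train) is approximately L_val(w) - lr * (g_val . g_train),
    so the estimated reduction is lr * (g_val . g_train).
    """
    g_train = grad_squared_error(w, cand.x, cand.y)
    g_val = [0.0] * len(w)
    for x, y in val_set:
        g = grad_squared_error(w, x, y)
        g_val = [a + b / len(val_set) for a, b in zip(g_val, g)]
    return lr * dot(g_val, g_train)


def select_top_k(
    cands: Sequence[Candidate], key: Callable[[Candidate], float], k: int
) -> List[Candidate]:
    """Keep the k candidates with the highest score under `key`."""
    return sorted(cands, key=key, reverse=True)[:k]


# Toy data (assumed): candidate "b" has the higher Q-value, but a gradient
# step on it would move the model away from the validation targets.
w = [0.0, 0.0]
val_set = [([1.0, 0.0], 1.0), ([0.0, 1.0], 1.0)]
cands = [
    Candidate("a", [1.0, 1.0], 2.0, q_value=0.2),
    Candidate("b", [1.0, 0.0], -1.0, q_value=0.9),
]

by_q = select_top_k(cands, key=lambda c: c.q_value, k=1)
by_influence = select_top_k(cands, key=lambda c: influence_score(w, c, val_set), k=1)
print([c.name for c in by_q], [c.name for c in by_influence])
```

In this toy setup the two criteria disagree: ranking by Q-value keeps candidate "b", while ranking by the estimated influence keeps candidate "a". This is the kind of misalignment the abstract describes.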
