OrchDAG: Complex Tool Orchestration in Multi-Turn Interactions with Plan DAGs

Yifu Lu; Shengjie Liu; Li Dong

arXiv:2510.24663 · cs.AI · October 29, 2025

TL;DR

OrchDAG introduces a synthetic dataset modeling complex multi-turn tool interactions as DAGs, providing a challenging benchmark and a graph-based reward to improve reinforcement learning in agentic tool use scenarios.

Contribution

The paper presents OrchDAG, a novel synthetic dataset and reward mechanism that enhance modeling and training of complex multi-turn tool interactions using DAGs.

Findings

01 The dataset is challenging but solvable.

02 Graph-based reward improves RL training.

03 Leveraging topological structure enhances performance.

Abstract

Agentic tool use has gained traction with the rise of agentic tool calling, yet most existing work overlooks the complexity of multi-turn tool interactions. We introduce OrchDAG, a synthetic data generation pipeline that models tool execution as directed acyclic graphs (DAGs) with controllable complexity. Using this dataset, we benchmark model performance and propose a graph-based reward to enhance RLVR training. Experiments show that the dataset presents a challenging but solvable benchmark, and the proposed reward is effective when combined with GRPO-style algorithms, highlighting the importance of leveraging topological structure and data complexity in multi-turn tool use.
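
To make the idea of a plan DAG and a graph-based reward concrete, here is a minimal Python sketch. It is not the paper's implementation: the abstract does not specify the reward formula, so the node and edge F1 scoring below is an assumption chosen for illustration, and every tool name, the `node_weight` parameter, and the helper functions are hypothetical. A plan is represented as a mapping from each tool to the set of tools whose outputs it depends on.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical reference plan: tool -> set of tools it depends on.
reference_plan = {
    "search_flights": set(),
    "search_hotels": set(),
    "convert_currency": {"search_flights", "search_hotels"},
    "book_trip": {"convert_currency"},
}


def nodes(plan):
    """All tools mentioned in a plan, as keys or as dependencies."""
    return set(plan) | {dep for deps in plan.values() for dep in deps}


def edges(plan):
    """All (dependency, tool) edges in a plan."""
    return {(dep, tool) for tool, deps in plan.items() for dep in deps}


def is_valid_dag(plan):
    """True if the plan contains no dependency cycle."""
    try:
        TopologicalSorter(plan).prepare()
    except CycleError:
        return False
    return True


def execution_stages(plan):
    """Group tools into stages; tools within one stage are independent."""
    sorter = TopologicalSorter(plan)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


def f1(predicted, reference):
    """F1 overlap between two sets; two empty sets count as a perfect match."""
    if not predicted and not reference:
        return 1.0
    true_positives = len(predicted & reference)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(reference)
    return 2 * precision * recall / (precision + recall)


def graph_reward(predicted, reference, node_weight=0.5):
    """Assumed reward: weighted node F1 and edge F1, zero for cyclic plans."""
    if not is_valid_dag(predicted):
        return 0.0
    node_score = f1(nodes(predicted), nodes(reference))
    edge_score = f1(edges(predicted), edges(reference))
    return node_weight * node_score + (1 - node_weight) * edge_score
```

Calling `execution_stages(reference_plan)` groups the two searches into the first stage, followed by the currency conversion and then the booking. A predicted plan that selects the right tools but omits a dependency edge scores below 1.0 under `graph_reward`, which is the kind of structural signal a purely final-answer reward would not provide.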
