CoVe: Training Interactive Tool-Use Agents via Constraint-Guided Verification

Jinpeng Chen; Cheng Gong; Hanbo Li; Ziru Liu; Zichen Tian; Xinyu Fu; Shi Wu; Chenyang Zhang; Wu Zhang; Suiyun Zhang; Dandan Tu; Rui Liu

arXiv:2603.01940·cs.AI·March 3, 2026

Open Access 1 Models 2 Datasets

TL;DR

CoVe introduces a constraint-guided verification framework for training interactive tool-use agents, enabling the synthesis of high-quality data and improving success rates in complex multi-turn tasks.

Contribution

The paper presents CoVe, a novel post-training data synthesis method that uses explicit task constraints for generating and verifying training trajectories for tool-use agents.

Findings

01 CoVe-4B achieves a 43.0% success rate in the Airline domain.

02 CoVe-4B achieves a 59.4% success rate in the Retail domain.

03 CoVe-4B outperforms similar-scale baselines and rivals larger models.

Abstract

Developing multi-turn interactive tool-use agents is challenging because real-world user needs are often complex and ambiguous, yet agents must execute deterministic actions to satisfy them. To address this gap, we introduce CoVe (Constraint-Verification), a post-training data synthesis framework designed for training interactive tool-use agents while ensuring both data complexity and correctness. CoVe begins by defining explicit task constraints, which serve a dual role: they guide the generation of complex trajectories and act as deterministic verifiers for assessing trajectory quality. This enables the creation of high-quality training trajectories for supervised fine-tuning (SFT) and the derivation of accurate reward signals for reinforcement learning (RL). Our evaluation on the challenging τ²-bench benchmark demonstrates the effectiveness of the…
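To make the dual role of constraints more concrete, here is a minimal Python sketch of the verification side only. It is not the authors' implementation: the abstract does not say how constraints are represented, so the `Constraint` class, the dictionary-based final state, the `keep_for_sft` filter, and the fraction-satisfied `reward` are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Callable

# Assumption: a trajectory is judged by the environment state it leaves behind.
# The abstract does not specify this; a plain dict stands in for that state.
State = dict


@dataclass(frozen=True)
class Constraint:
    """Hypothetical explicit task constraint with a deterministic check."""
    name: str
    check: Callable[[State], bool]


def verify(state: State, constraints: list[Constraint]) -> dict[str, bool]:
    """Evaluate every constraint against the final state, with no model in the loop."""
    return {c.name: c.check(state) for c in constraints}


def keep_for_sft(state: State, constraints: list[Constraint]) -> bool:
    """Assumed SFT filter: keep a trajectory only if all constraints hold."""
    return all(verify(state, constraints).values())


def reward(state: State, constraints: list[Constraint]) -> float:
    """Assumed RL reward: fraction of constraints satisfied (a binary reward is equally plausible)."""
    results = verify(state, constraints)
    return sum(results.values()) / len(results) if results else 0.0


# Illustrative airline-style task; field names and values are invented.
constraints = [
    Constraint("cabin_is_economy", lambda s: s.get("cabin") == "economy"),
    Constraint("price_under_500", lambda s: s.get("price", float("inf")) < 500),
]
good = {"cabin": "economy", "price": 420}
bad = {"cabin": "business", "price": 420}

print(keep_for_sft(good, constraints), reward(good, constraints))
print(keep_for_sft(bad, constraints), reward(bad, constraints))
```

Because each check is an ordinary predicate, the same constraint list could in principle both seed task generation and score the resulting trajectory, which is the property the abstract highlights.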

Code & Models

Models

Zichen1024/CoVe-4B (Hugging Face model · 100 downloads · 5 likes)

Taxonomy

Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · Autonomous Vehicle Technology and Safety