CoVe: Training Interactive Tool-Use Agents via Constraint-Guided Verification
Jinpeng Chen, Cheng Gong, Hanbo Li, Ziru Liu, Zichen Tian, Xinyu Fu, Shi Wu, Chenyang Zhang, Wu Zhang, Suiyun Zhang, Dandan Tu, Rui Liu

TL;DR
CoVe introduces a constraint-guided verification framework for training interactive tool-use agents, enabling the synthesis of high-quality data and improving success rates in complex multi-turn tasks.
Contribution
The paper presents CoVe, a novel post-training data synthesis method that uses explicit task constraints for generating and verifying training trajectories for tool-use agents.
Findings
CoVe-4B achieves a 43.0% success rate in the Airline domain.
CoVe-4B achieves a 59.4% success rate in the Retail domain.
CoVe-4B outperforms baselines of similar scale and rivals larger models.
Abstract
Developing multi-turn interactive tool-use agents is challenging because real-world user needs are often complex and ambiguous, yet agents must execute deterministic actions to satisfy them. To address this gap, we introduce CoVe (Constraint-Verification), a post-training data synthesis framework designed for training interactive tool-use agents while ensuring both data complexity and correctness. CoVe begins by defining explicit task constraints, which serve a dual role: they guide the generation of complex trajectories and act as deterministic verifiers for assessing trajectory quality. This enables the creation of high-quality training trajectories for supervised fine-tuning (SFT) and the derivation of accurate reward signals for reinforcement learning (RL). Our evaluation on the challenging τ²-bench benchmark demonstrates the effectiveness of the…
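The dual role of constraints described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `ToolCall` and `Constraint` types, the helper names, the tool names (`cancel_reservation`, `issue_refund`), and the binary reward are all assumptions made for illustration. Each constraint carries a natural-language description, which could steer trajectory generation, and a deterministic check, which verifies a finished trajectory.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class ToolCall:
    """One tool invocation in a trajectory (hypothetical structure)."""
    name: str
    args: Dict[str, Any]


@dataclass
class Constraint:
    """A task constraint with a dual role (hypothetical structure)."""
    description: str                         # could be used to guide generation
    check: Callable[[List[ToolCall]], bool]  # deterministic verifier


def called_with(name: str, **expected: Any) -> Callable[[List[ToolCall]], bool]:
    """Passes if some call to `name` carries all the expected argument values."""
    def check(trajectory: List[ToolCall]) -> bool:
        return any(
            call.name == name
            and all(call.args.get(k) == v for k, v in expected.items())
            for call in trajectory
        )
    return check


def never_called(name: str) -> Callable[[List[ToolCall]], bool]:
    """Passes if the tool `name` is never invoked."""
    def check(trajectory: List[ToolCall]) -> bool:
        return all(call.name != name for call in trajectory)
    return check


def verify(trajectory: List[ToolCall], constraints: List[Constraint]) -> List[str]:
    """Return the descriptions of all violated constraints."""
    return [c.description for c in constraints if not c.check(trajectory)]


def reward(trajectory: List[ToolCall], constraints: List[Constraint]) -> float:
    """Assumed binary reward: 1.0 only if every constraint holds."""
    return 1.0 if not verify(trajectory, constraints) else 0.0


# Hypothetical airline-style task.
constraints = [
    Constraint("Cancel reservation ABC123",
               called_with("cancel_reservation", reservation_id="ABC123")),
    Constraint("Do not issue a refund", never_called("issue_refund")),
]

good = [
    ToolCall("get_reservation", {"reservation_id": "ABC123"}),
    ToolCall("cancel_reservation", {"reservation_id": "ABC123"}),
]
bad = good + [ToolCall("issue_refund", {"amount": 100})]

# Keep only verified trajectories, as one might when building an SFT set.
sft_data = [t for t in (good, bad) if reward(t, constraints) == 1.0]
```

In this sketch the same `reward` function serves both purposes named in the abstract: it filters synthesized trajectories for SFT and could supply a scalar signal for RL. A graded reward, such as the fraction of satisfied constraints, would be an equally plausible design under these assumptions.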
Taxonomy
Topics: Multimodal Machine Learning Applications · Reinforcement Learning in Robotics · Autonomous Vehicle Technology and Safety
