APIGen: Automated Pipeline for Generating Verifiable and Diverse Function-Calling Datasets

Zuxin Liu; Thai Hoang; Jianguo Zhang; Ming Zhu; Tian Lan; Shirley Kokane; Juntao Tan; Weiran Yao; Zhiwei Liu; Yihao Feng; Rithesh Murthy; Liangwei Yang; Silvio Savarese; Juan Carlos Niebles; Huan Wang; Shelby Heinecke; Caiming Xiong

arXiv:2406.18518·cs.CL·June 27, 2024·2 cites

TL;DR

APIGen is an automated pipeline that synthesizes high-quality, verifiable function-calling datasets from a large collection of APIs, significantly improving model performance in function-calling tasks.

Contribution

The paper introduces APIGen, a scalable and structured data generation pipeline that creates diverse, verified datasets for function-calling models, enhancing their reliability and performance.

Findings

01

Models trained on APIGen datasets outperform multiple GPT-4 models on the Berkeley Function-Calling Benchmark.

02

A model with only 7B parameters achieves state-of-the-art performance on the Berkeley Function-Calling Benchmark.

03

The dataset contains 60,000 high-quality entries.

Abstract

The advancement of function-calling agent models requires diverse, reliable, and high-quality datasets. This paper presents APIGen, an automated data generation pipeline designed to synthesize verifiable, high-quality datasets for function-calling applications. We leverage APIGen and collect 3,673 executable APIs across 21 different categories to generate diverse function-calling datasets in a scalable and structured manner. Each data point in our dataset is verified through three hierarchical stages: format checking, actual function execution, and semantic verification, ensuring its reliability and correctness. We demonstrate that models trained with our curated datasets, even with only 7B parameters, can achieve state-of-the-art performance on the Berkeley Function-Calling Benchmark, outperforming multiple GPT-4 models. Moreover, our 1B model achieves exceptional performance, surpassing…
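The three hierarchical verification stages named in the abstract can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the `query` and `answers` field names, the `API_REGISTRY` with its toy `add` function, and the `semantic_checker` callback are all hypothetical. In particular, the semantic stage is represented here only by a caller-supplied function, since the abstract does not describe how that check is carried out.

```python
import json
from typing import Any, Callable, Dict, List, Optional


def add(a: float, b: float) -> float:
    """Toy executable API used only for this illustration."""
    return a + b


# Hypothetical registry: API name -> (callable, required parameter names).
API_REGISTRY: Dict[str, Any] = {"add": (add, ["a", "b"])}


def check_format(raw: str) -> Optional[List[Dict[str, Any]]]:
    """Stage 1: parse the generated sample and validate its structure."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or not isinstance(data.get("query"), str):
        return None
    calls = data.get("answers")
    if not isinstance(calls, list) or not calls:
        return None
    for call in calls:
        if not isinstance(call, dict):
            return None
        name, args = call.get("name"), call.get("arguments")
        if not isinstance(name, str) or name not in API_REGISTRY:
            return None
        if not isinstance(args, dict):
            return None
        if any(param not in args for param in API_REGISTRY[name][1]):
            return None
    return calls


def execute_calls(calls: List[Dict[str, Any]]) -> Optional[List[Any]]:
    """Stage 2: actually run each call; any exception rejects the sample."""
    results = []
    for call in calls:
        func = API_REGISTRY[call["name"]][0]
        try:
            results.append(func(**call["arguments"]))
        except Exception:
            return None
    return results


def verify(raw: str, semantic_checker: Callable[[str, list, list], bool]) -> str:
    """Run the stages in order, stopping at the first failure."""
    calls = check_format(raw)
    if calls is None:
        return "rejected:format"
    results = execute_calls(calls)
    if results is None:
        return "rejected:execution"
    query = json.loads(raw)["query"]
    # Stage 3: hypothetical hook that judges whether the calls and their
    # results actually address the query.
    if not semantic_checker(query, calls, results):
        return "rejected:semantic"
    return "accepted"


def always_ok(query: str, calls: list, results: list) -> bool:
    """Placeholder semantic checker that accepts everything."""
    return True


sample = json.dumps({
    "query": "What is 2 plus 3?",
    "answers": [{"name": "add", "arguments": {"a": 2, "b": 3}}],
})
print(verify(sample, always_ok))
```

The stages are ordered from cheapest to most expensive, so a malformed sample is discarded before any function is executed, and a sample whose calls fail at run time never reaches the semantic check.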

Taxonomy

Topics: Scientific Computing and Data Management · Time Series Analysis and Forecasting

Methods: Absolute Position Encodings · Label Smoothing · Cosine Annealing · Position-Wise Feed-Forward Layer · Linear Layer · Residual Connection · Multi-Head Attention · Weight Decay