Hephaestus: Improving Fundamental Agent Capabilities of Large Language Models through Continual Pre-Training

Yuchen Zhuang; Jingfeng Yang; Haoming Jiang; Xin Liu; Kewei Cheng; Sanket Lokegaonkar; Yifan Gao; Qing Ping; Tianyi Liu; Binxuan Huang; Zheng Li; Zhengyang Wang; Pei Chen; Ruijie Wang; Rongzhi Zhang; Nasser Zalmout; Priyanka Nigam; Bing Yin; Chao Zhang

arXiv:2502.06589·cs.CL·February 11, 2025

TL;DR

Hephaestus introduces a large-scale, agent-specific pre-training corpus that significantly enhances LLMs' abilities in API calling, reasoning, and adaptation, outperforming existing models on key benchmarks.

Contribution

The paper presents Hephaestus-Forge, a corpus of 103B agent-specific data covering 76,537 APIs, and uses it for continual pre-training to improve the fundamental agent capabilities of large language models.

Findings

01

Hephaestus outperforms small- to medium-scale open-source LLMs.

02

Hephaestus rivals commercial LLMs on agent benchmarks.

03

Continual pre-training enhances generalization to new tasks.

Abstract

Due to the scarcity of agent-oriented pre-training data, LLM-based autonomous agents typically rely on complex prompting or extensive fine-tuning, which often fails to introduce new capabilities while preserving strong generalizability. We introduce Hephaestus-Forge, the first large-scale pre-training corpus designed to enhance the fundamental capabilities of LLM agents in API function calling, intrinsic reasoning and planning, and adapting to environmental feedback. Hephaestus-Forge comprises 103B agent-specific data encompassing 76,537 APIs, including both tool documentation to introduce knowledge of API functions and function calling trajectories to strengthen intrinsic reasoning. To explore effective training protocols, we investigate scaling laws to identify the optimal recipe in data mixing ratios. By continual pre-training on Hephaestus-Forge, Hephaestus outperforms small- to…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques