LLM Agents Making Agent Tools

Georg Wölflein; Dyke Ferber; Daniel Truhn; Ognjen Arandjelović; Jakob Nikolas Kather

arXiv:2502.11705 · cs.CL · June 2, 2025 · 3 cites

TL;DR

ToolMaker autonomously converts scientific papers with code into LLM-compatible tools, so that agents working across scientific domains rely less on tools implemented in advance by human developers.

Contribution

The paper introduces ToolMaker, an agentic framework that autonomously transforms scientific papers with code into tools for LLM agents.

Findings

01. Correctly implements 80% of complex tasks

02. Outperforms current software engineering agents

03. Provides a new benchmark for tool correctness and robustness

Abstract

Tool use has turned large language models (LLMs) into powerful agents that can perform complex multi-step tasks by dynamically utilising external software components. However, these tools must be implemented in advance by human developers, hindering the applicability of LLM agents in domains demanding large numbers of highly specialised tools, like in life sciences and medicine. Motivated by the growing trend of scientific studies accompanied by public code repositories, we propose ToolMaker, an agentic framework that autonomously transforms papers with code into LLM-compatible tools. Given a GitHub URL and short task description, ToolMaker autonomously installs dependencies and generates code to perform the task, using a closed-loop self-correction mechanism for debugging. To evaluate our approach, we introduce a benchmark comprising 15 complex computational tasks spanning various…
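The closed-loop self-correction mentioned in the abstract can be illustrated with a minimal sketch. It is not ToolMaker's implementation: the names `self_correct`, `toy_generate`, and `toy_execute` are hypothetical, and the "LLM" is replaced by a toy function that returns a fixed repair once it has seen an error message. The sketch only shows the general pattern of generating code, running it, and feeding the failure back into the next attempt.

```python
from typing import Callable, List, Optional, Tuple

# One attempt is recorded as (code, succeeded, output_or_error).
Attempt = Tuple[str, bool, str]


def self_correct(
    generate: Callable[[str, Optional[str]], str],
    execute: Callable[[str], Tuple[bool, str]],
    task: str,
    max_rounds: int = 3,
) -> List[Attempt]:
    """Generate code, run it, and pass any error back until it succeeds."""
    history: List[Attempt] = []
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        code = generate(task, feedback)   # hypothetical stand-in for an LLM call
        ok, output = execute(code)        # run the candidate and observe the result
        history.append((code, ok, output))
        if ok:
            break
        feedback = output                 # close the loop: the error drives the next draft
    return history


def toy_generate(task: str, feedback: Optional[str]) -> str:
    # Assumption for illustration: the first draft is buggy, the second is repaired.
    return "result = 1 / 0" if feedback is None else "result = 42"


def toy_execute(code: str) -> Tuple[bool, str]:
    scope: dict = {}
    try:
        exec(code, scope)
        return True, repr(scope["result"])
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"


history = self_correct(toy_generate, toy_execute, "compute the answer")
for code, ok, output in history:
    print(ok, code, "->", output)
```

In a real system the executor would presumably run inside an isolated environment with the repository's dependencies installed, and the round limit guards against a loop that never converges.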

Code & Models

Repositories

katherlab/toolmaker (Official)

Videos

LLM Agents Making Agent Tools · underline

Taxonomy

Topics: Multi-Agent Systems and Negotiation · Scheduling and Optimization Algorithms · Business Process Modeling and Analysis

Methods: Umbrella Reinforcement Learning