LLM Agents Making Agent Tools
Georg Wölflein, Dyke Ferber, Daniel Truhn, Ognjen Arandjelović, Jakob Nikolas Kather

TL;DR
ToolMaker autonomously converts scientific papers with accompanying code into LLM-compatible tools, enabling multi-domain scientific workflows with high correctness and robustness while reducing reliance on human-developed tools.
Contribution
The paper introduces ToolMaker, a framework that automatically transforms scientific papers with code into tools for LLM agents, advancing autonomous scientific workflows.
Findings
Correctly implements 80% of complex tasks
Outperforms current software engineering agents
Provides a new benchmark for tool correctness and robustness
Abstract
Tool use has turned large language models (LLMs) into powerful agents that can perform complex multi-step tasks by dynamically utilising external software components. However, these tools must be implemented in advance by human developers, hindering the applicability of LLM agents in domains demanding large numbers of highly specialised tools, like in life sciences and medicine. Motivated by the growing trend of scientific studies accompanied by public code repositories, we propose ToolMaker, an agentic framework that autonomously transforms papers with code into LLM-compatible tools. Given a GitHub URL and short task description, ToolMaker autonomously installs dependencies and generates code to perform the task, using a closed-loop self-correction mechanism for debugging. To evaluate our approach, we introduce a benchmark comprising 15 complex computational tasks spanning various…
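The closed-loop self-correction idea mentioned in the abstract can be illustrated with a minimal sketch. This is not ToolMaker's implementation: the names `Attempt`, `run_candidate`, `self_correct`, and `toy_generator` are hypothetical, and the toy generator stands in for an LLM call. The sketch only shows the loop shape: generate candidate code, run it, and feed any error back into the next attempt.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Attempt:
    code: str
    error: Optional[str]  # None means the candidate ran without raising


def run_candidate(code: str) -> Optional[str]:
    """Run candidate code in a fresh namespace; return error text, or None on success."""
    try:
        exec(code, {})
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def self_correct(
    generate: Callable[[str, list[Attempt]], str],
    task: str,
    max_rounds: int = 3,
) -> list[Attempt]:
    """Generate, execute, and retry with the accumulated error history as feedback."""
    history: list[Attempt] = []
    for _ in range(max_rounds):
        code = generate(task, history)
        error = run_candidate(code)
        history.append(Attempt(code, error))
        if error is None:
            break  # candidate ran cleanly, stop iterating
    return history


def toy_generator(task: str, history: list[Attempt]) -> str:
    # Hypothetical stand-in for an LLM: the first draft is buggy,
    # and the bug is fixed once an error has been fed back.
    if not history:
        return "result = 1 / 0"
    return "result = 1 / 1"


if __name__ == "__main__":
    for attempt in self_correct(toy_generator, "divide two numbers"):
        print(attempt.error or "ok")
```

In a real system the candidate would run in an isolated environment rather than through `exec`, and success would be judged against the task description rather than by the absence of an exception; both simplifications are assumptions of this sketch.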
Taxonomy
Topics: Multi-Agent Systems and Negotiation · Scheduling and Optimization Algorithms · Business Process Modeling and Analysis
Methods: Umbrella Reinforcement Learning
