LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error
Boshi Wang, Hao Fang, Jason Eisner, Benjamin Van Durme, Yu Su

TL;DR
This paper introduces simulated trial and error (STE), a biologically inspired method that substantially improves how accurately large language models (LLMs) use tools and makes their interactions with external environments more reliable.
Contribution
The paper proposes STE, a novel trial-and-error approach inspired by biological systems, to improve tool use accuracy in LLMs through imagination, memory, and interaction feedback.
Findings
STE boosts the tool-use correctness of Mistral-Instruct-7B by 46.7% on ToolBench.
With STE, Mistral-Instruct-7B outperforms GPT-4 on tool-use tasks.
A simple experience replay strategy enables effective continual learning of tools.
Abstract
Tools are essential for large language models (LLMs) to acquire up-to-date information and take consequential actions in external environments. Existing work on tool-augmented LLMs primarily focuses on the broad coverage of tools and the flexibility of adding new tools. However, a critical aspect that has surprisingly been understudied is simply how accurately an LLM uses tools for which it has been trained. We find that existing LLMs, including GPT-4 and open-source LLMs specifically fine-tuned for tool use, only reach a correctness rate in the range of 30% to 60%, far from reliable use in practice. We propose a biologically inspired method for tool-augmented LLMs, simulated trial and error (STE), that orchestrates three key mechanisms for successful tool use behaviors in the biological system: trial and error, imagination, and memory. Specifically, STE leverages an LLM's 'imagination'…
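The three mechanisms named in the abstract (trial and error, imagination, memory) can be illustrated with a toy exploration loop. This is a minimal sketch under stated assumptions, not the authors' implementation: every name here (`Trial`, `Memory`, `imagine_query`, `propose_call`, `execute_tool`, `explore`) is hypothetical, and the LLM and the tool are replaced by deterministic stand-ins so the example runs on its own.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Trial:
    query: str
    call: str
    feedback: str
    success: bool


@dataclass
class Memory:
    """Hypothetical store of past trials, consulted when imagining new queries."""
    trials: list = field(default_factory=list)

    def add(self, trial: Trial) -> None:
        self.trials.append(trial)

    def seen_queries(self) -> set:
        return {t.query for t in self.trials}


def imagine_query(tool_name: str, memory: Memory, rng: random.Random) -> str:
    # Stand-in for an LLM "imagining" a usage scenario; memory steers it
    # away from scenarios that were already explored.
    candidates = [f"{tool_name} scenario {i}" for i in range(10)]
    unseen = [c for c in candidates if c not in memory.seen_queries()]
    return rng.choice(unseen or candidates)


def propose_call(tool_name: str, query: str, last_feedback):
    # Stand-in for the LLM writing a tool call: the first attempt is
    # malformed on purpose, and the retry uses the feedback to fix it.
    if last_feedback is None:
        return f'{tool_name}(query="{query}"'
    return f'{tool_name}(query="{query}")'


def execute_tool(call: str):
    # Stand-in for real tool execution: rejects calls with a missing ")".
    if call.endswith(")"):
        return "ok", True
    return "error: malformed call", False


def explore(tool_name: str, episodes: int, max_attempts: int, seed: int = 0) -> Memory:
    rng = random.Random(seed)
    memory = Memory()
    for _ in range(episodes):
        query = imagine_query(tool_name, memory, rng)   # imagination
        feedback = None
        for _ in range(max_attempts):                   # trial and error
            call = propose_call(tool_name, query, feedback)
            feedback, success = execute_tool(call)
            memory.add(Trial(query, call, feedback, success))  # memory
            if success:
                break
    return memory


memory = explore("weather", episodes=3, max_attempts=3)
successful = [t for t in memory.trials if t.success]
```

In this toy setup each episode fails once and succeeds on the retry, so the memory ends up holding both the failed and the corrected calls. In a real system the successful trials would presumably be the ones reused as in-context examples or fine-tuning data, but that step is outside this sketch.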
Taxonomy
Topics: Artificial Intelligence in Law · Legal Education and Practice Innovations · Law, AI, and Intellectual Property
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Dropout · Softmax · Dense Connections · Label Smoothing · Adam · Absolute Position Encodings · Residual Connection
