ART: Automatic multi-step reasoning and tool-use for large language models
Bhargavi Paranjape, Scott Lundberg, Sameer Singh, Hannaneh Hajishirzi, Luke Zettlemoyer, Marco Tulio Ribeiro

TL;DR
ART is a framework that enables large language models to perform multi-step reasoning and tool use automatically, improving performance on complex tasks without extensive manual prompt engineering.
Contribution
The paper introduces ART, a novel method for automatic multi-step reasoning and tool use in LLMs, reducing manual effort and enhancing task performance.
Findings
Substantial improvement over few-shot prompting and automatic CoT on unseen BigBench and MMLU tasks.
Matches hand-crafted CoT prompts on most tasks.
Easy human correction and extension of reasoning programs.
Abstract
Large language models (LLMs) can perform complex reasoning in few- and zero-shot settings by generating intermediate chain of thought (CoT) reasoning steps. Further, each reasoning step can rely on external tools to support computation beyond the core LLM capabilities (e.g. search/running code). Prior work on CoT prompting and tool use typically requires hand-crafting task-specific demonstrations and carefully scripted interleaving of model generations with tool use. We introduce Automatic Reasoning and Tool-use (ART), a framework that uses frozen LLMs to automatically generate intermediate reasoning steps as a program. Given a new task to solve, ART selects demonstrations of multi-step reasoning and tool use from a task library. At test time, ART seamlessly pauses generation whenever external tools are called, and integrates their output before resuming generation. ART achieves a…
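The pause-and-resume behavior described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the paper's implementation: the `[tool] args` call syntax, the `#result:` marker, the word-overlap demonstration selection, the toy `add` tool, and the `scripted_llm` stand-in for a frozen LLM are all hypothetical.

```python
import re
from typing import Callable, Dict, List

# Hypothetical tool registry; the paper's actual tools are not listed here.
TOOLS: Dict[str, Callable[[str], str]] = {
    "add": lambda args: str(sum(int(x) for x in args.split(","))),
}

# Hypothetical task library: each entry pairs a task description with a
# demonstration of multi-step reasoning and tool use.
LIBRARY: List[dict] = [
    {"task": "arithmetic word problem plus minus",
     "demo": "Task: What is 2 plus 3?\n[add] 2, 3\n#result: 5\nAnswer: 5\n"},
    {"task": "translate a sentence to French",
     "demo": "Task: Translate 'hello'.\nAnswer: bonjour\n"},
]

# Assumed tool-call syntax: "[tool_name] arguments" on a single line.
TOOL_CALL = re.compile(r"\[(\w+)\]\s*(.+)$")


def select_demos(task: str, library: List[dict], k: int = 1) -> List[dict]:
    """Pick the k library entries sharing the most words with the task.

    Word overlap is a stand-in; the selection method is an assumption.
    """
    task_words = set(task.lower().split())

    def overlap(entry: dict) -> int:
        return len(task_words & set(entry["task"].lower().split()))

    return sorted(library, key=overlap, reverse=True)[:k]


def run_with_tools(generate: Callable[[str], str], prompt: str,
                   tools: Dict[str, Callable[[str], str]],
                   max_steps: int = 10) -> str:
    """Generate one step at a time, pausing to run a tool when one is called."""
    transcript = prompt
    for _ in range(max_steps):
        step = generate(transcript)          # one reasoning step
        transcript += step + "\n"
        match = TOOL_CALL.match(step)
        if match and match.group(1) in tools:
            # Pause generation, call the tool, splice its output back in.
            result = tools[match.group(1)](match.group(2))
            transcript += f"#result: {result}\n"
        elif step.startswith("Answer:"):
            break                            # final answer reached
    return transcript


def scripted_llm(transcript: str) -> str:
    """Stand-in for a frozen LLM: calls the tool once, then answers."""
    last = transcript.rstrip("\n").split("\n")[-1]
    if last.startswith("#result: "):
        return "Answer: " + last[len("#result: "):]
    return "[add] 17, 25"


def solve(task: str) -> str:
    """Build a prompt from selected demonstrations and run the loop."""
    demos = select_demos(task, LIBRARY, k=1)
    prompt = "".join(d["demo"] for d in demos) + f"Task: {task}\n"
    return run_with_tools(scripted_llm, prompt, TOOLS)
```

Calling `solve("What is 17 plus 25?")` selects the arithmetic demonstration, lets the stand-in model emit a tool call, inserts the tool's result into the transcript, and resumes generation until an `Answer:` line appears. Swapping `scripted_llm` for a real model call would require that model to stop after each step, which this sketch simply assumes.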
Code & Models
- 🤗 EpistemeAI/Fireball-Meta-Llama-3.1-8B-Instruct-Agent-0.003-128K-code-ds-auto (model) · ♡ 8
- 🤗 EpistemeAI/Fireball-Meta-Llama-3.1-8B-Instruct-Agent-0.004-128K-code-ds-auto (model)
- 🤗 EpistemeAI/Polypsyche-Llama-3.1-8B-Instruct-Agent-0.003-128K-code-ds-auto-Logic (model) · ♡ 1
- 🤗 EpistemeAI/Polypsyche-Llama-3.1-8B-Instruct-Agent-0.003-128K-code-ds-auto-Empathy (model)
- 🤗 EpistemeAI/Polypsyche-Llama-3.1-8B-Instruct-Agent-0.003-128K-code-ds-auto-divergent (model)
- 🤗 EpistemeAI/Polypsyche-Llama-3.1-8B-Instruct-Agent-0.0031-128K-code-ds-auto-Logic (model) · ♡ 1
- 🤗 EpistemeAI/Polypsyche-Llama-3.1-8B-Instruct-Agent-0.0031-128K-code-ds-auto-divergent (model)
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
Methods: Chain-of-thought prompting · Test
