ART: Automatic multi-step reasoning and tool-use for large language models

Bhargavi Paranjape; Scott Lundberg; Sameer Singh; Hannaneh Hajishirzi; Luke Zettlemoyer; Marco Tulio Ribeiro

arXiv:2303.09014 · cs.CL · March 17, 2023 · 49 cites

TL;DR

ART is a framework that enables large language models to perform multi-step reasoning and tool use automatically, improving performance on complex tasks without extensive manual prompt engineering.

Contribution

The paper introduces ART, a novel method for automatic multi-step reasoning and tool use in LLMs, reducing manual effort and enhancing task performance.

Findings

01. Substantial improvement over few-shot prompting and automatic CoT on unseen tasks in the BigBench and MMLU benchmarks.

02. Matches hand-crafted CoT prompts on most of these tasks.

03. Humans can easily correct and extend the generated reasoning programs.

Abstract

Large language models (LLMs) can perform complex reasoning in few- and zero-shot settings by generating intermediate chain of thought (CoT) reasoning steps. Further, each reasoning step can rely on external tools to support computation beyond the core LLM capabilities (e.g. search/running code). Prior work on CoT prompting and tool use typically requires hand-crafting task-specific demonstrations and carefully scripted interleaving of model generations with tool use. We introduce Automatic Reasoning and Tool-use (ART), a framework that uses frozen LLMs to automatically generate intermediate reasoning steps as a program. Given a new task to solve, ART selects demonstrations of multi-step reasoning and tool use from a task library. At test time, ART seamlessly pauses generation whenever external tools are called, and integrates their output before resuming generation. ART achieves a…
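
The loop the abstract describes (select demonstrations from a task library, generate a program step by step, pause at each tool call, splice the tool output back in, resume) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the step syntax (`Q1: [tool] args`, `#1: output`, `Ans:`), the word-overlap demo selection, the `add` tool, and the `scripted_llm` stand-in for a frozen LLM are all hypothetical choices made so the example runs without a model.

```python
import re
from typing import Callable, Dict, List

# Hypothetical tool registry; ART's real tools (search, code execution) are not modeled.
TOOLS: Dict[str, Callable[[str], str]] = {
    "add": lambda arg: str(sum(int(x) for x in arg.split())),
}

# Hypothetical task library: task description -> one worked demonstration program.
LIBRARY: Dict[str, str] = {
    "add numbers arithmetic": "Input: add 2 and 3\nQ1: [add] 2 3\n#1: 5\nAns: 5",
    "translate text": "Input: translate 'chat'\nAns: cat",
}

# Assumed step format: "Q<n>: [<tool>] <arguments>".
TOOL_CALL = re.compile(r"^Q(\d+): \[(\w+)\] (.*)$")


def select_demos(task: str, library: Dict[str, str], k: int = 1) -> List[str]:
    """Pick the k demonstrations whose descriptions share the most words with
    the task (an assumed similarity measure, chosen for simplicity)."""
    words = set(task.lower().split())
    ranked = sorted(
        library,
        key=lambda name: len(words & set(name.lower().split())),
        reverse=True,
    )
    return [library[name] for name in ranked[:k]]


def scripted_llm(prompt: str) -> str:
    """Stand-in for a frozen LLM: first emits a tool call, then answers using
    the tool output that was spliced into the prompt."""
    current = prompt.rsplit("Input:", 1)[1]  # program for the new task only
    if "#1:" not in current:
        return "Q1: [add] 17 25"
    tool_output = current.rsplit("#1: ", 1)[1].strip()
    return f"Ans: {tool_output}"


def run_program(task: str, library: Dict[str, str],
                llm: Callable[[str], str], max_steps: int = 10) -> str:
    """Generate one step at a time; on a tool call, pause, run the tool,
    append its output to the prompt, and resume generation."""
    prompt = "\n".join(select_demos(task, library)) + f"\nInput: {task}\n"
    for _ in range(max_steps):
        line = llm(prompt)
        prompt += line + "\n"
        if line.startswith("Ans:"):
            return line[len("Ans:"):].strip()
        match = TOOL_CALL.match(line)
        if match and match.group(2) in TOOLS:
            step, tool, arg = match.groups()
            prompt += f"#{step}: {TOOLS[tool](arg)}\n"  # integrate tool output
    raise RuntimeError("no answer produced within max_steps")


if __name__ == "__main__":
    print(run_program("add the numbers 17 and 25", LIBRARY, scripted_llm))
```

Because the model is only ever called with the growing prompt, the same loop would work with any frozen LLM that completes one line at a time; swapping `scripted_llm` for a real completion call is the only change that would be needed under these assumptions.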

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems

Methods: Chain-of-thought prompting · Test