Efficient Tool Use with Chain-of-Abstraction Reasoning

Silin Gao; Jane Dwivedi-Yu; Ping Yu; Xiaoqing Ellen Tan; Ramakanth Pasunuru; Olga Golovneva; Koustuv Sinha; Asli Celikyilmaz; Antoine Bosselut; Tianlu Wang

arXiv:2401.17464·cs.CL·January 9, 2025·5 cites

Open Access

TL;DR

This paper introduces Chain-of-Abstraction, a method enabling large language models to plan and execute multi-step reasoning with tools more effectively, improving accuracy and efficiency across domains.

Contribution

It proposes a novel training approach where LLMs generate abstract reasoning chains before filling in specific knowledge, enhancing robustness and parallel tool usage.

Findings

1. Improves QA accuracy by about 6% on average over chain-of-thought and tool-augmented baselines.

2. Runs inference about 1.4 times faster on average than baseline tool-augmented LLMs.

3. Generalizes better to out-of-distribution questions.

Abstract

To achieve faithful reasoning that aligns with human expectations, large language models (LLMs) need to ground their reasoning in real-world knowledge (e.g., web facts, math and physical rules). Tools help LLMs access this external knowledge, but there remain challenges in fine-tuning LLM agents (e.g., Toolformer) to invoke tools in multi-step reasoning problems, where interconnected tool calls require holistic and efficient tool usage planning. In this work, we propose a new method for LLMs to better leverage tools in multi-step reasoning. Our method, Chain-of-Abstraction (CoA), trains LLMs to first decode reasoning chains with abstract placeholders, and then call domain tools to reify each reasoning chain by filling in specific knowledge. This planning with abstract chains enables LLMs to learn more general reasoning strategies, which are robust to shifts of domain knowledge…
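
The two stages described in the abstract (decode an abstract chain, then fill it in with tools) can be illustrated with a small Python sketch. It is not the authors' implementation: the bracketed placeholder syntax `[expression = name]`, the `calculator` stand-in tool, and the `reify` helper are all hypothetical names chosen for illustration, and the abstract chain is hard-coded here rather than decoded by a fine-tuned LLM.

```python
import ast
import operator
import re

# Assumed (hypothetical) placeholder syntax: "[<expression> = <name>]".
STEP = re.compile(r"\[([^\[\]=]+)=\s*(\w+)\s*\]")

# Arithmetic operators the stand-in tool is allowed to apply.
OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}


def calculator(expression, known):
    """Stand-in domain tool: evaluate + - * / over numbers and known placeholders."""

    def ev(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name):
            return known[node.id]  # value of an earlier placeholder
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](ev(node.left), ev(node.right))
        raise ValueError(f"unsupported expression: {expression!r}")

    return ev(ast.parse(expression.strip(), mode="eval").body)


def reify(chain):
    """Fill each placeholder step, left to right, with the tool's result."""
    values = {}

    def fill(match):
        expr, name = match.group(1), match.group(2)
        values[name] = calculator(expr, values)
        return f"{values[name]:g}"

    return STEP.sub(fill, chain), values


# An abstract chain as a model might emit it (hard-coded for this sketch).
abstract_chain = (
    "Tim has [20 + 35 = y1] pens. "
    "After giving away 15, he has [y1 - 15 = y2] left."
)
text, values = reify(abstract_chain)
print(text)
```

Because the chain references `y1` symbolically instead of a concrete number, the same chain stays valid if the tool or the underlying facts change. This sketch fills placeholders sequentially and makes no attempt to reproduce the parallel tool usage mentioned in the contribution summary.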

Taxonomy

Topics: Semantic Web and Ontologies · Advanced Database Systems and Queries · Business Process Modeling and Analysis

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings