OctoTools: An Agentic Framework with Extensible Tools for Complex Reasoning

Pan Lu; Bowen Chen; Sheng Liu; Rahul Thapa; Joseph Boen; James Zou

arXiv:2502.11271 · cs.LG · April 15, 2026 · 2 cites

TL;DR

OctoTools is a versatile, training-free multi-agent framework that enhances complex reasoning across diverse domains by integrating standardized tools, planning, and execution, outperforming existing methods.

Contribution

It introduces standardized tool cards, a multi-level planner, and an executor, enabling effective, extensible reasoning without additional training across 16 diverse tasks.
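The interplay of tool cards, planner, and executor can be illustrated with a minimal sketch. This is not the OctoTools API: `ToolCard`, `plan_next_step`, `execute`, `solve`, and the toy `calculator` tool are hypothetical names chosen for illustration, and the planner is a fixed rule standing in for the LLM call a real system would make.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class ToolCard:
    """Hypothetical tool card: metadata plus a callable (not the OctoTools schema)."""
    name: str
    description: str
    run: Callable[[str], str]


def plan_next_step(query: str, history: List[str], tools: Dict[str, ToolCard]) -> Optional[str]:
    """Stand-in planner: a fixed rule replaces the LLM call.

    Returns the name of the next tool to use, or None when no step remains.
    """
    if not history and "calculator" in tools:
        return "calculator"
    return None


def execute(tool: ToolCard, query: str) -> str:
    """Stand-in executor: runs the chosen tool on the query."""
    return tool.run(query)


def solve(query: str, tools: Dict[str, ToolCard], max_steps: int = 5) -> List[str]:
    """Alternate planning and execution until the planner stops or the step budget ends."""
    history: List[str] = []
    for _ in range(max_steps):
        tool_name = plan_next_step(query, history, tools)
        if tool_name is None:
            break
        history.append(execute(tools[tool_name], query))
    return history


def add_numbers(expression: str) -> str:
    """Toy tool: sums whitespace-separated integers."""
    return str(sum(int(token) for token in expression.split()))


TOOLS = {"calculator": ToolCard("calculator", "Sums integers in a string.", add_numbers)}
result = solve("2 3 4", TOOLS)
print(result)
```

Because each tool is described by a card rather than hard-wired into the loop, adding a tool in this sketch means adding one entry to `TOOLS`; no training step is involved, which mirrors the training-free, extensible design the summary describes.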

Findings

01 Achieved an average accuracy gain of 9.3% over GPT-4o across 16 tasks.

02 Outperformed AutoGen, GPT-Functions, and LangChain by up to 10.6%.

03 Demonstrated robustness and effectiveness in noisy and varied environments.

Abstract

Solving complex reasoning tasks may involve visual understanding, domain knowledge retrieval, numerical calculation, and multi-step reasoning. Existing methods augment large language models (LLMs) with external tools but are restricted to specialized domains or limited tool types, or require additional training data. In this paper, we introduce OctoTools, a training-free, user-friendly, and easily extensible multi-agent framework designed to tackle complex reasoning across diverse domains. OctoTools introduces standardized tool cards to encapsulate tool functionality, a planner for both high-level and low-level planning, and an executor to carry out tool usage. We validate OctoTools' generality across 16 diverse tasks (including MathVista, MMLU-Pro, MedQA, and GAIA-Text), achieving substantial average accuracy gains of 9.3% over GPT-4o. Furthermore, OctoTools outperforms AutoGen,…
