Model-Adaptive Tool Necessity Reveals the Knowing-Doing Gap in LLM Tool Use

Yize Cheng; Chenrui Fan; Mahdi JafariRaviz; Keivan Rezaei; Soheil Feizi

arXiv:2605.14038 · cs.AI · May 19, 2026

TL;DR

This paper investigates the discrepancy between how necessary external tools are for a given LLM and how that LLM actually uses tools, revealing a knowing-doing gap and analyzing its underlying causes.

Contribution

It introduces a model-adaptive definition of tool necessity, compares that necessity with observed tool-call behavior across models, and diagnoses where the transition from cognition to action fails.

Findings

01

Model-adaptive tool necessity and observed tool-call behavior mismatch substantially (26.5-54.0%) across four models on arithmetic and factual QA.

02

Both internal cognition and execution signals are linearly decodable.

03

Most mismatch occurs in the transition from recognizing necessity to acting on it.
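
To make "linearly decodable" concrete, here is a minimal sketch of a linear probe: a classifier that reads a signal off a feature vector using only a weighted sum and a threshold. It is an illustration under stated assumptions, not the paper's probing setup, which this page does not describe. The function names `train_linear_probe` and `probe_accuracy` are hypothetical, and the short vectors used with them would be toy stand-ins for model hidden states.

```python
# Minimal linear-probe sketch (perceptron). Hypothetical helper names;
# inputs are toy stand-ins for hidden states, not real model activations.

def score(weights, bias, x):
    """Linear score w.x + b for one feature vector."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def train_linear_probe(features, labels, epochs=50, lr=0.1):
    """Fit a linear separator with the classic perceptron update rule."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(features, labels):
            pred = 1 if score(weights, bias, x) > 0 else 0
            if pred != y:
                # Move the boundary toward the misclassified example.
                sign = 1 if y == 1 else -1
                weights = [w + lr * sign * xi for w, xi in zip(weights, x)]
                bias += lr * sign
                errors += 1
        if errors == 0:  # every training example is classified correctly
            break
    return weights, bias


def probe_accuracy(weights, bias, features, labels):
    """Fraction of examples whose thresholded linear score matches the label."""
    hits = sum(
        (1 if score(weights, bias, x) > 0 else 0) == y
        for x, y in zip(features, labels)
    )
    return hits / len(labels)
```

If a probe of this kind reaches high accuracy on held-out examples, the signal is said to be linearly decodable from the features. A real probe would be trained on actual hidden states and evaluated on a separate test split.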

Abstract

Large language models (LLMs) increasingly act as autonomous agents that must decide when to answer directly and when to invoke external tools. Prior work on adaptive tool use has largely treated tool necessity as a model-agnostic property, annotated by human or LLM judges, and has mostly covered cases where the answer is obvious (e.g., fetching the weather vs. paraphrasing text). However, tool necessity in the wild is more nuanced because capability boundaries diverge across models: a problem that a strong model can solve on its own may still require tools for a weaker one. In this work, we introduce a model-adaptive definition of tool necessity, grounded in each model's empirical performance. Following this definition, we compare tool necessity against observed tool-call behavior across four models on arithmetic and factual QA datasets, and find substantial mismatches of 26.5-54.0%…
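
The comparison described in the abstract can be sketched as a simple mismatch rate between a per-model necessity label and the model's observed tool-call decision. The abstract does not spell out the exact necessity rule here, so this sketch assumes the simplest reading: a tool counts as necessary for a model on a question if that model fails to answer correctly without it. The `Record` fields and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """One (model, question) observation. Field names are hypothetical."""
    solved_without_tool: bool  # model answered correctly with no tool access
    called_tool: bool          # model chose to call a tool when one was offered


def tool_needed(record: Record) -> bool:
    # Assumed rule: the tool is necessary for this model on this question
    # exactly when the model cannot solve the question on its own.
    return not record.solved_without_tool


def mismatch_rate(records: list[Record]) -> float:
    """Fraction of records where necessity and behavior disagree."""
    if not records:
        raise ValueError("need at least one record")
    mismatches = sum(tool_needed(r) != r.called_tool for r in records)
    return mismatches / len(records)


def mismatch_breakdown(records: list[Record]) -> dict[str, int]:
    """Split mismatches into the two possible directions."""
    return {
        # Tool was needed but the model answered directly.
        "missed_calls": sum(tool_needed(r) and not r.called_tool for r in records),
        # Tool was not needed but the model called it anyway.
        "unnecessary_calls": sum(not tool_needed(r) and r.called_tool for r in records),
    }
```

Because the label depends on each model's own performance, the same question can be labeled "tool needed" for a weaker model and "tool not needed" for a stronger one, so the rate has to be computed per model.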

Code & Models

Repositories

chengez/Tool-Cognition-Action
github

Datasets

yizecheng/model-adaptive-tool-necessity
dataset · 37 downloads
