Learning to Ask: When LLM Agents Meet Unclear Instruction

Wenxuan Wang; Juluan Shi; Zixuan Ling; Yuk-Kit Chan; Chaozheng Wang; Cheryl Lee; Youliang Yuan; Jen-tse Huang; Wenxiang Jiao; Michael R. Lyu

arXiv:2409.00557·cs.CL·April 30, 2026

TL;DR

This paper introduces Ask-when-Needed (AwN), a framework that prompts LLMs to ask clarifying questions when instructions are unclear, improving tool use under noisy real-world conditions.

Contribution

The paper presents the AwN framework and the NoisyToolBench benchmark, addressing the challenge of imperfect instructions in LLM tool utilization.

Findings

01

AwN significantly improves LLM performance on NoisyToolBench.

02

LLMs tend to hallucinate missing arguments, a tendency the authors attribute to the next-token prediction training objective.

03

Automated evaluation with ToolEvaluator effectively measures LLM accuracy and efficiency.

Abstract

Equipped with the capability to call functions, modern large language models (LLMs) can leverage external tools to address a range of tasks unattainable through language skills alone. However, the effective execution of these tools relies heavily not just on the advanced capabilities of LLMs but also on precise user instructions, which often cannot be ensured in the real world. To evaluate the performance of LLMs' tool use under imperfect instructions, we meticulously examine real-world instructions from users, analyze the error patterns, and build a challenging tool-use benchmark called Noisy ToolBench (NoisyToolBench). We find that, due to the next-token prediction training objective, LLMs tend to arbitrarily generate missing arguments, which may lead to hallucinations and risks. To address this issue, we propose a novel framework, Ask-when-Needed (AwN), which prompts…
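The core idea in the abstract, asking the user rather than guessing an argument the instruction never supplied, can be illustrated with a minimal Python sketch. This is not the authors' implementation: `ToolSpec`, `Decision`, and `ask_when_needed` are hypothetical names introduced here, and the sketch assumes the arguments have already been parsed from the instruction into a dictionary, whereas AwN works by prompting the LLM itself.

```python
from dataclasses import dataclass, field


@dataclass
class ToolSpec:
    """Hypothetical description of a tool and the arguments it requires."""
    name: str
    required: list[str]


@dataclass
class Decision:
    """Either a ready-to-run tool call or a clarifying question."""
    action: str  # "call" or "ask"
    arguments: dict[str, str] = field(default_factory=dict)
    question: str = ""


def ask_when_needed(tool: ToolSpec, extracted: dict[str, str]) -> Decision:
    """Ask for any required argument the instruction did not supply.

    `extracted` stands in for the arguments parsed from the user's
    instruction (an assumption of this sketch); absent or empty values
    count as missing and are never filled in by guessing.
    """
    missing = [arg for arg in tool.required if not extracted.get(arg)]
    if missing:
        question = f"To use {tool.name}, please provide: {', '.join(missing)}."
        return Decision(action="ask", question=question)
    return Decision(
        action="call",
        arguments={arg: extracted[arg] for arg in tool.required},
    )


# Usage: the instruction names a city but no date, so the sketch asks.
weather = ToolSpec(name="get_weather", required=["city", "date"])
print(ask_when_needed(weather, {"city": "Hong Kong"}).question)
```

The design choice worth noting is that a missing argument produces a question instead of a default value, which is the behavior the abstract contrasts with arbitrarily generating the argument.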
