Can Tool-augmented Large Language Models be Aware of Incomplete Conditions?
Seungbin Yang, ChaeHun Park, Taehee Kim, Jaegul Choo

TL;DR
This paper investigates whether tool-augmented large language models can recognize incomplete or ambiguous conditions and appropriately decide when to refrain from using tools, addressing a key challenge for reliable real-world AI applications.
Contribution
The study introduces a benchmark dataset and a prompting strategy that improve LLMs' ability to identify incomplete conditions and make better tool-use decisions.
Findings
State-of-the-art LLMs often fail to recognize incomplete conditions.
The proposed prompting strategy significantly improves recognition accuracy.
Enhanced models make more appropriate and reliable tool-use decisions.
Abstract
Recent advancements in integrating large language models (LLMs) with tools have allowed the models to interact with real-world environments. However, these tool-augmented LLMs often encounter incomplete scenarios when users provide partial information or the necessary tools are unavailable. Recognizing and managing such scenarios is crucial for LLMs to ensure their reliability, but this problem remains understudied. This study examines whether LLMs can identify incomplete conditions and appropriately determine when to refrain from using tools. To quantitatively evaluate this capability, we construct a new benchmark dataset where instances are systematically altered to simulate the ambiguous and incomplete conditions common in real-world interactions. Our experiments reveal that even state-of-the-art LLMs often struggle to identify these conditions, attempting to use tools without…
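The two kinds of incomplete condition named in the abstract (a necessary tool is unavailable, or the user supplies only partial information) can be illustrated with a small sketch. This is not the authors' benchmark construction or prompting strategy; `ToolSpec`, `check_conditions`, `remove_tool`, and `drop_argument` are hypothetical names introduced here only to show what "altering an instance" and "refraining from a tool call" might look like in code.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToolSpec:
    """Hypothetical tool description: a name plus its required argument names."""
    name: str
    required_args: list[str] = field(default_factory=list)


def check_conditions(
    needed_tool: str,
    available_tools: dict[str, ToolSpec],
    provided_args: dict[str, str],
) -> tuple[bool, str]:
    """Return (can_call, reason). can_call is False when conditions are incomplete."""
    spec = available_tools.get(needed_tool)
    if spec is None:
        # Incomplete condition 1: the necessary tool is not available.
        return False, f"tool '{needed_tool}' is unavailable"
    missing = [arg for arg in spec.required_args if arg not in provided_args]
    if missing:
        # Incomplete condition 2: the user gave only partial information.
        return False, "missing arguments: " + ", ".join(missing)
    return True, "ok"


def remove_tool(available_tools: dict[str, ToolSpec], name: str) -> dict[str, ToolSpec]:
    """Alter an instance by removing one tool (assumed perturbation, for illustration)."""
    return {k: v for k, v in available_tools.items() if k != name}


def drop_argument(provided_args: dict[str, str], name: str) -> dict[str, str]:
    """Alter an instance by dropping one user-provided argument (assumed perturbation)."""
    return {k: v for k, v in provided_args.items() if k != name}


if __name__ == "__main__":
    tools = {"get_weather": ToolSpec("get_weather", ["city", "date"])}
    request = {"city": "Seoul", "date": "2024-06-01"}

    print(check_conditions("get_weather", tools, request))
    print(check_conditions("get_weather", remove_tool(tools, "get_weather"), request))
    print(check_conditions("get_weather", tools, drop_argument(request, "date")))
```

In this toy setting the check is a simple lookup. The paper's question is harder: whether an LLM, given only the natural-language request and the tool descriptions, reaches the same "refrain" decision on its own.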
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
