Can Tool-augmented Large Language Models be Aware of Incomplete Conditions?

Seungbin Yang; ChaeHun Park; Taehee Kim; Jaegul Choo

arXiv:2406.12307·cs.CL·August 5, 2025·1 cites

TL;DR

This paper investigates whether tool-augmented large language models can recognize incomplete or ambiguous conditions and appropriately decide when to refrain from using tools, addressing a key challenge for reliable real-world AI applications.

Contribution

The study introduces a benchmark dataset and a prompting strategy that improve LLMs' ability to identify incomplete conditions and make better tool-use decisions.

Findings

1. State-of-the-art LLMs often fail to recognize incomplete conditions.
2. The proposed prompting strategy significantly improves recognition accuracy.
3. Enhanced models make more appropriate and reliable tool-use decisions.

Abstract

Recent advancements in integrating large language models (LLMs) with tools have allowed the models to interact with real-world environments. However, these tool-augmented LLMs often encounter incomplete scenarios when users provide partial information or the necessary tools are unavailable. Recognizing and managing such scenarios is crucial for ensuring the reliability of LLMs, but this problem remains understudied. This study examines whether LLMs can identify incomplete conditions and appropriately determine when to refrain from using tools. To quantitatively evaluate this capability, we construct a new benchmark dataset where instances are systematically altered to simulate the ambiguous and incomplete conditions common in real-world interactions. Our experiments reveal that even state-of-the-art LLMs often struggle to identify these conditions, attempting to use tools without…
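To make the two kinds of incomplete condition named in the abstract concrete (the user supplies only partial information, or the necessary tool is unavailable), here is a minimal rule-based sketch. It is not the paper's benchmark or prompting strategy, which test whether the LLM itself recognizes these cases. The names `ToolSpec`, `Decision`, `check_conditions`, and the `book_flight` tool are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToolSpec:
    """Hypothetical tool description: a name plus its required argument names."""
    name: str
    required: tuple[str, ...] = ()


@dataclass
class Decision:
    """Whether to issue the tool call, and why not if it is withheld."""
    call_tool: bool
    reasons: list[str] = field(default_factory=list)


def check_conditions(tool_name: str,
                     arguments: dict[str, object],
                     available_tools: dict[str, ToolSpec]) -> Decision:
    """Decide whether a proposed tool call has everything it needs."""
    spec = available_tools.get(tool_name)
    if spec is None:
        # Incomplete condition: the necessary tool is not in the toolset.
        return Decision(False, [f"tool '{tool_name}' is unavailable"])

    # Incomplete condition: the user left out required information.
    missing = [p for p in spec.required if arguments.get(p) in (None, "")]
    if missing:
        return Decision(False, [f"missing argument '{p}'" for p in missing])

    return Decision(True)


# Illustrative toolset with a single hypothetical tool.
tools = {"book_flight": ToolSpec("book_flight", ("origin", "destination", "date"))}

complete = check_conditions(
    "book_flight",
    {"origin": "ICN", "destination": "SFO", "date": "2025-08-05"},
    tools,
)
partial = check_conditions("book_flight", {"origin": "ICN", "destination": "SFO"}, tools)
no_tool = check_conditions("rent_car", {"city": "Seoul"}, tools)

for label, decision in (("complete", complete), ("partial", partial), ("no_tool", no_tool)):
    print(label, decision.call_tool, decision.reasons)
```

A check like this only catches gaps that are visible in a schema. The setting studied in the paper is harder, because the model must infer from the conversation and tool descriptions that a request cannot be fulfilled and then refrain from calling a tool.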

Code & Models

Datasets

ddehun/ICT
dataset · 530 dl

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling