Do not Abstain! Identify and Solve the Uncertainty
Jingyu Liu, Jingquan Peng, Xiaopeng Wu, Xubin Li, Tiezheng Ge, Bo Zheng, Yong Liu

TL;DR
This paper introduces ConfuseBench, a benchmark to evaluate LLMs' ability to recognize and address different sources of uncertainty, and proposes methods to improve their uncertainty handling and response quality.
Contribution
The paper presents ConfuseBench for systematic evaluation and introduces InteractDPO, a training method to enhance LLMs' capacity to identify and solve uncertainty sources.
Findings
LLMs often misattribute uncertainty to query ambiguity.
Current models struggle to identify the true source of uncertainty.
Proposed methods improve inquiry generation and uncertainty recognition.
Abstract
Despite the widespread application of Large Language Models (LLMs) across various domains, they frequently exhibit overconfidence when encountering uncertain scenarios. Existing solutions primarily rely on evasive responses (e.g., "I don't know"), which overlooks the opportunity to identify and address the uncertainty and thereby generate more satisfactory responses. To systematically investigate and improve LLMs' ability to recognize and address the source of uncertainty, we introduce ConfuseBench, a benchmark that mainly focuses on three types of uncertainty: document scarcity, limited capability, and query ambiguity. Experiments with ConfuseBench reveal that current LLMs struggle to accurately identify the root cause of uncertainty and solve it. They tend to attribute uncertainty to query ambiguity while overlooking capability limitations, and this tendency is stronger in weaker models. To…
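The misattribution pattern described above can be made concrete with a small tally. The following is a minimal sketch, not the paper's evaluation code: the `UncertaintySource` enum, the `attribution_report` function, and the toy labels are all hypothetical, and only the three category names come from the abstract. It compares gold uncertainty sources with a model's predicted sources and reports both overall accuracy and how often each source is predicted.

```python
from collections import Counter
from enum import Enum


class UncertaintySource(Enum):
    # The three uncertainty types named in the abstract.
    DOCUMENT_SCARCITY = "document scarcity"
    LIMITED_CAPABILITY = "limited capability"
    QUERY_AMBIGUITY = "query ambiguity"


def attribution_report(gold, predicted):
    """Return overall accuracy and the share of predictions per source.

    Hypothetical helper for illustration; not an API from the paper.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("need at least one example")
    total = len(gold)
    correct = sum(g == p for g, p in zip(gold, predicted))
    predicted_counts = Counter(predicted)  # missing keys count as 0
    return {
        "accuracy": correct / total,
        "predicted_share": {
            source: predicted_counts[source] / total
            for source in UncertaintySource
        },
    }


# Toy labels invented for this sketch; they are not results from the paper.
gold = [
    UncertaintySource.DOCUMENT_SCARCITY,
    UncertaintySource.LIMITED_CAPABILITY,
    UncertaintySource.LIMITED_CAPABILITY,
    UncertaintySource.QUERY_AMBIGUITY,
]
predicted = [
    UncertaintySource.DOCUMENT_SCARCITY,
    UncertaintySource.QUERY_AMBIGUITY,
    UncertaintySource.QUERY_AMBIGUITY,
    UncertaintySource.QUERY_AMBIGUITY,
]

report = attribution_report(gold, predicted)
for source, share in report["predicted_share"].items():
    print(f"{source.value}: predicted {share:.0%} of the time")
print(f"accuracy: {report['accuracy']:.0%}")
```

In this toy case, a model that labels both capability-limited cases as query ambiguity shows an inflated predicted share for query ambiguity and a zero share for limited capability, which is the shape of the bias the findings describe.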
Taxonomy
Topics: Complex Systems and Decision Making
