Ask Again, Then Fail: Large Language Models' Vacillations in Judgment
Qiming Xie, Zengzhi Wang, Yi Feng, and Rui Xia

TL;DR
This paper examines how large language models abandon initially correct judgments when faced with follow-up questions, and proposes a training framework to make their responses more reliable and trustworthy.
Contribution
It introduces a Follow-up Questioning Mechanism, with two accompanying metrics, to measure wavering in language model judgments, and a training-based framework, Unwavering-FQ, to reduce it.
Findings
The wavering issue is widespread in current models.
Prompting strategies can partially mitigate judgment wavering.
Unwavering-FQ improves model consistency and judgment stability.
Abstract
We observe that current conversational language models often waver in their judgments when faced with follow-up questions, even if the original judgment was correct. This wavering presents a significant challenge for generating reliable responses and building user trust. To comprehensively assess this issue, we introduce a Follow-up Questioning Mechanism along with two metrics to quantify this inconsistency, confirming its widespread presence in current language models. To mitigate this issue, we explore various prompting strategies for closed-source models; moreover, we develop a training-based framework Unwavering-FQ that teaches language models to maintain their originally correct judgments through synthesized high-quality preference data. Our experimental results confirm the effectiveness of our framework and its ability to enhance the general capabilities of…
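The abstract mentions two metrics for quantifying wavering but does not define them in this excerpt. As a rough sketch only, one plausible way to measure the effect is to compare accuracy before and after a follow-up challenge. The names `Trial` and `wavering_stats`, and the two quantities computed (absolute and relative accuracy drop), are illustrative assumptions and not necessarily the paper's definitions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Trial:
    """One question asked twice (hypothetical record layout)."""
    gold: str     # reference answer
    initial: str  # model's answer before the follow-up question
    after: str    # model's answer after the follow-up question


def wavering_stats(trials: List[Trial]) -> Dict[str, float]:
    """Compare accuracy before and after a follow-up challenge.

    Assumed metrics (not taken from the paper):
      drop          = accuracy before - accuracy after
      relative_drop = drop / accuracy before
    """
    if not trials:
        raise ValueError("need at least one trial")
    n = len(trials)
    acc_before = sum(t.initial == t.gold for t in trials) / n
    acc_after = sum(t.after == t.gold for t in trials) / n
    drop = acc_before - acc_after
    # Avoid dividing by zero when the model was never right initially.
    relative_drop = drop / acc_before if acc_before > 0 else 0.0
    return {
        "acc_before": acc_before,
        "acc_after": acc_after,
        "drop": drop,
        "relative_drop": relative_drop,
    }


# Toy data: three of four answers start correct, one survives the follow-up.
trials = [
    Trial(gold="A", initial="A", after="B"),
    Trial(gold="C", initial="C", after="C"),
    Trial(gold="B", initial="B", after="D"),
    Trial(gold="D", initial="A", after="A"),
]
stats = wavering_stats(trials)
print(stats)
```

A relative drop is reported alongside the absolute one because a model with low initial accuracy has fewer correct judgments to lose, so the absolute drop alone can understate how often it wavers.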
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Multimodal Machine Learning Applications
