Self-Consistency of Large Language Models under Ambiguity

Henning Bartsch; Ole Jorgensen; Domenic Rosati; Jason Hoelscher-Obermaier; Jacob Pfau

arXiv:2310.13439 · cs.CL · October 23, 2023 · 2 cites

Open Access 1 Repo

TL;DR

This paper evaluates the self-consistency of large language models on ambiguous tasks. It finds that models often assign significant probability to alternative answers and that self-consistency improves with model capability, although calibration issues remain.

Contribution

The authors introduce a benchmark for assessing self-consistency in LLMs under ambiguity and analyze models' behavior, robustness, and internal probability distributions.

Findings

01. Average consistency ranges from 67% to 82%, far higher than would be predicted if consistency were random.

02. Self-consistency increases with model capability.

03. Models often assign significant probability to alternative answers.

Abstract

Large language models (LLMs) that do not give consistent answers across contexts are problematic when used for tasks with expectations of consistency, e.g., question-answering, explanations, etc. Our work presents an evaluation benchmark for self-consistency in cases of under-specification where two or more answers can be correct. We conduct a series of behavioral experiments on the OpenAI model suite using an ambiguous integer sequence completion task. We find that average consistency ranges from 67% to 82%, far higher than would be predicted if a model's consistency was random, and increases as model capability improves. Furthermore, we show that models tend to maintain self-consistency across a series of robustness checks, including prompting speaker changes and sequence length changes. These results suggest that self-consistency arises as an emergent capability without…
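To make the task concrete, the following is a minimal sketch of an ambiguous integer sequence and one possible consistency score. It is an illustration only, not the benchmark's implementation: the rule set `RULES`, the helper names, and the mock responses are all hypothetical, and the scoring (a response counts as consistent when the continuation a model gives matches the rule it states) is an assumed reading of the abstract.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical candidate rules, indexed from n = 0 (not taken from the paper).
RULES: Dict[str, Callable[[int], int]] = {
    "powers_of_two": lambda n: 2 ** n,              # 1, 2, 4, 8, ...
    "lazy_caterer": lambda n: n * (n + 1) // 2 + 1,  # 1, 2, 4, 7, ...
}


def matching_rules(prefix: List[int]) -> List[str]:
    """Return the names of all rules that reproduce the given prefix."""
    return [
        name
        for name, rule in RULES.items()
        if all(rule(i) == value for i, value in enumerate(prefix))
    ]


def is_consistent(prefix: List[int], stated_rule: str, completion: int) -> bool:
    """A response is consistent if the stated rule predicts the completion."""
    return RULES[stated_rule](len(prefix)) == completion


def consistency_rate(prefix: List[int], responses: List[Tuple[str, int]]) -> float:
    """Fraction of (stated_rule, completion) pairs that agree with each other."""
    hits = sum(is_consistent(prefix, rule, nxt) for rule, nxt in responses)
    return hits / len(responses)


# The prefix 1, 2, 4 is under-specified: both rules fit, but they disagree
# on the next term (8 versus 7).
prefix = [1, 2, 4]
ambiguous = matching_rules(prefix)

# Mock responses standing in for model outputs (invented for illustration).
responses = [
    ("powers_of_two", 8),
    ("lazy_caterer", 7),
    ("powers_of_two", 7),  # states one rule, continues with the other
    ("lazy_caterer", 7),
]
rate = consistency_rate(prefix, responses)
```

Here both answers are correct continuations of the prefix, so the score measures agreement between a model's own outputs rather than accuracy against a single ground truth.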

Code & Models

Repositories

jacobpfau/introspective-self-consistency (official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Algorithms