Referential ambiguity and clarification requests: comparing human and LLM behaviour
Chris Madge, Matthew Purver, Massimo Poesio

TL;DR
This study compares how humans and large language models ask clarification questions in task dialogues, revealing differences in how they respond to ambiguity and in how reasoning ability affects clarification behaviour.
Contribution
The paper introduces a new combined corpus for studying clarifications and ambiguity, and analyzes differences between human and LLM clarification strategies in dialogue.
Findings
Humans rarely ask clarifications for referential ambiguity.
LLMs ask more clarifications for referential ambiguity than humans.
Reasoning ability in LLMs increases the relevance and frequency of clarification questions.
Abstract
In this work we examine LLMs' ability to ask clarification questions in task-oriented dialogues that follow the asynchronous instruction-giver/instruction-follower format. We present a new corpus that combines two existing annotations of the Minecraft Dialogue Corpus -- one for reference and ambiguity in reference, and one for SDRT including clarifications -- into a single common format providing the necessary information to experiment with clarifications and their relation to ambiguity. With this corpus we compare LLM actions with original human-generated clarification questions, examining how both humans and LLMs act in the case of ambiguity. We find that there is only a weak link between ambiguity and humans producing clarification questions in these dialogues, and low correlation between humans and LLMs. Humans hardly ever produce clarification questions for referential ambiguity,…
Taxonomy
Topics: Artificial Intelligence in Law
