Beyond Single Models: Enhancing LLM Detection of Ambiguity in Requests through Debate

Ana Davila; Jacinto Colan; Yasuhisa Hasegawa

arXiv:2507.12370 · cs.CL · July 17, 2025

TL;DR

This paper presents a multi-agent debate framework that significantly improves large language models' ability to detect and resolve ambiguities in user requests, surpassing single-model performance especially for complex cases.

Contribution

Introduces a multi-agent debate framework with diverse LLM architectures to enhance ambiguity detection and resolution beyond traditional single-model approaches.

Findings

01. Debate framework improved Llama3-8B and Mistral-7B performance.

02. Mistral-7B-led debates achieved 76.7% success rate.

03. Framework is effective for complex ambiguities and consensus building.

Abstract

Large Language Models (LLMs) have demonstrated significant capabilities in understanding and generating human language, contributing to more natural interactions with complex systems. However, they still struggle with ambiguity in the user requests they process. To address this challenge, this paper introduces and evaluates a multi-agent debate framework designed to enhance ambiguity detection and resolution beyond what single models achieve. The framework combines three LLM architectures (Llama3-8B, Gemma2-9B, and Mistral-7B variants) and is evaluated on a dataset with diverse ambiguities. The debate framework markedly improved the performance of the Llama3-8B and Mistral-7B variants over their individual baselines, with Mistral-7B-led debates achieving a 76.7% success rate and proving particularly effective for complex ambiguities and efficient consensus. While acknowledging varying model…
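To make the idea of a leader-led debate concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the protocol, so the turn order, the consensus rule, the majority-vote fallback, and every name below (`run_debate`, `DebateResult`, and the stub agents) are assumptions for illustration. The stub agents are deterministic stand-ins for real LLM calls.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical types: a verdict is "ambiguous" or "clear"; an agent maps
# (request, transcript so far) to a (verdict, rationale) pair.
Turn = Tuple[str, str, str]  # (agent name, verdict, rationale)
Agent = Callable[[str, List[Turn]], Tuple[str, str]]


@dataclass
class DebateResult:
    verdict: str
    rounds: int
    consensus: bool
    transcript: List[Turn]


def run_debate(request: str, agents: Dict[str, Agent], leader: str,
               max_rounds: int = 3) -> DebateResult:
    """Assumed protocol: the leader speaks first in every round; the debate
    stops as soon as all agents give the same verdict within one round."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    order = [leader] + [name for name in agents if name != leader]
    transcript: List[Turn] = []
    verdicts: List[str] = []
    for round_no in range(1, max_rounds + 1):
        verdicts = []
        for name in order:
            verdict, rationale = agents[name](request, transcript)
            transcript.append((name, verdict, rationale))
            verdicts.append(verdict)
        if len(set(verdicts)) == 1:
            return DebateResult(verdicts[0], round_no, True, transcript)
    # No consensus: majority of the final round, ties go to the leader.
    counts = Counter(verdicts)
    best = max(counts.values())
    winners = [v for v in counts if counts[v] == best]
    final = verdicts[0] if verdicts[0] in winners else winners[0]
    return DebateResult(final, max_rounds, False, transcript)


# --- Stub agents (placeholders for real model calls) ---
VAGUE_WORDS = {"it", "that", "there", "soon", "some"}


def keyword_agent(request: str, transcript: List[Turn]) -> Tuple[str, str]:
    hits = VAGUE_WORDS & set(request.lower().split())
    if hits:
        return "ambiguous", "vague terms: " + ", ".join(sorted(hits))
    return "clear", "no vague terms found"


def follower_agent(request: str, transcript: List[Turn]) -> Tuple[str, str]:
    # Adopts the most recent verdict in the transcript, if there is one.
    if transcript:
        return transcript[-1][1], "agrees with previous speaker"
    return "clear", "no prior turns"


def skeptic_agent(request: str, transcript: List[Turn]) -> Tuple[str, str]:
    # Says "clear" at first; after a full round, sides with the majority.
    if len(transcript) < 3:
        return "clear", "not yet convinced"
    majority = Counter(t[1] for t in transcript).most_common(1)[0][0]
    return majority, "persuaded by majority"


def stubborn_agent(request: str, transcript: List[Turn]) -> Tuple[str, str]:
    return "clear", "never changes its mind"
```

With these stubs, `run_debate("Move it over there soon", {"a": keyword_agent, "b": follower_agent, "c": skeptic_agent}, leader="a")` reaches consensus in the second round, once the skeptic has seen a full round of arguments. Swapping the stubs for calls to actual models, and replacing the binary verdict with a proposed clarification, would be the natural next steps, but how the paper handles either is not stated in the abstract.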

Taxonomy

Topics: Software Engineering Research