Are We Asking the Right Questions? On Ambiguity in Natural Language Queries for Tabular Data Analysis
Daniel Gomm, Cornelius Wolff, Madelon Hulsebos

TL;DR
This paper proposes a cooperative framework for handling ambiguity in natural language queries to tabular data, emphasizing user-system collaboration and providing insights for better interface design and evaluation.
Contribution
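The paper introduces a framework that treats ambiguity as part of a cooperative interaction between user and system. It distinguishes cooperative queries, which are either unambiguous or ambiguous but resolvable through reasonable inference, from uncooperative queries that cannot be resolved, and it analyzes queries in 15 datasets for tabular question answering and analysis.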
Findings
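The analyzed datasets mix query types without control, which makes them adequate neither for evaluating a system's accuracy nor for evaluating its interpretation capabilities.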
A shared responsibility model clarifies how systems and users handle ambiguity.
Framework guides future design and assessment of NL interfaces for data analysis.
Abstract
Natural language interfaces to tabular data must handle ambiguities inherent to queries. Instead of treating ambiguity as a deficiency, we reframe it as a feature of cooperative interaction where users are intentional about the degree to which they specify queries. We develop a principled framework based on a shared responsibility of query specification between user and system, distinguishing unambiguous and ambiguous cooperative queries, which systems can resolve through reasonable inference, from uncooperative queries that cannot be resolved. Applying the framework to evaluations for tabular question answering and analysis, we analyze queries in 15 datasets, and observe an uncontrolled mixing of query types neither adequate for evaluating a system's accuracy nor for evaluating interpretation capabilities. This conceptualization around cooperation in resolving queries informs how to…
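The three query types named in the abstract can be illustrated with a small sketch. This is not the authors' code or annotation scheme: the names `QueryType`, `type_mix`, and `split_by_type`, as well as the example queries and their labels, are hypothetical and chosen only to show how a labeled query set might be inspected for an uncontrolled mix of types.

```python
from collections import Counter
from enum import Enum


class QueryType(Enum):
    """Query categories as named in the abstract (labels are illustrative)."""
    UNAMBIGUOUS = "unambiguous"                       # cooperative, fully specified
    AMBIGUOUS_COOPERATIVE = "ambiguous_cooperative"   # resolvable by reasonable inference
    UNCOOPERATIVE = "uncooperative"                   # cannot be resolved


def type_mix(labels):
    """Return the share of each query type among the given labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {t: counts.get(t, 0) / total for t in QueryType}


def split_by_type(queries):
    """Group (query_text, QueryType) pairs into one list of texts per type."""
    groups = {t: [] for t in QueryType}
    for text, qtype in queries:
        groups[qtype].append(text)
    return groups


# Hypothetical, hand-labeled examples (assumptions, not taken from the paper).
labeled = [
    ("What is the sum of the 'revenue' column for 2023?", QueryType.UNAMBIGUOUS),
    ("Which products sold best last year?", QueryType.AMBIGUOUS_COOPERATIVE),
    ("What about the other one?", QueryType.UNCOOPERATIVE),
]

mix = type_mix([qtype for _, qtype in labeled])
groups = split_by_type(labeled)
for qtype in QueryType:
    print(f"{qtype.value}: {mix[qtype]:.2f} ({len(groups[qtype])} queries)")
```

Separating the groups first makes the composition of an evaluation set explicit, so that a reported score is not computed over an unknown blend of query types.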
Taxonomy
Topics: Topic Modeling · Data Quality and Management · Natural Language Processing Techniques
