Answering Ambiguous Questions with a Database of Questions, Answers, and Revisions
Haitian Sun, William W. Cohen, Ruslan Salakhutdinov

TL;DR
This paper answers ambiguous questions by retrieving from a database of unambiguous questions generated from Wikipedia. On the ASQA benchmark it improves recall measures by 15% (relative) and disambiguation measures by 10%, and it also improves diverse passage retrieval.
Contribution
The work presents a new state-of-the-art method that uses a database of generated questions to better handle ambiguity in open-domain question answering.
Findings
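15% relative improvement on recall measures on the ASQA benchmark
10% improvement on measures that evaluate disambiguating questions from predicted outputs
Large improvements in diverse passage retrieval through indirect matching against the database of generated questions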
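The recall figure above is a relative improvement, which is easy to confuse with an absolute gain in points. The sketch below shows the distinction. The function name and the baseline and new scores are hypothetical placeholders chosen for illustration, not numbers from the paper.

```python
def relative_improvement(baseline: float, new: float) -> float:
    """Return the gain of `new` over `baseline` as a fraction of `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (new - baseline) / baseline


# Hypothetical scores, for illustration only (not taken from the paper).
baseline_recall = 40.0
new_recall = 46.0

absolute_gain = new_recall - baseline_recall                        # gain in points
relative_gain = relative_improvement(baseline_recall, new_recall)   # gain as a fraction

print(f"absolute gain: {absolute_gain:.1f} points")
print(f"relative gain: {relative_gain:.0%}")
```

With these placeholder values, a 6-point absolute gain on a baseline of 40 corresponds to a 15% relative gain.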
Abstract
Many open-domain questions are under-specified and thus have multiple possible answers, each of which is correct under a different interpretation of the question. Answering such ambiguous questions is challenging, as it requires retrieving and then reasoning about diverse information from multiple passages. We present a new state-of-the-art for answering ambiguous questions that exploits a database of unambiguous questions generated from Wikipedia. On the challenging ASQA benchmark, which requires generating long-form answers that summarize the multiple answers to an ambiguous question, our method improves performance by 15% (relative improvement) on recall measures and 10% on measures which evaluate disambiguating questions from predicted outputs. Retrieving from the database of generated questions also gives large improvements in diverse passage retrieval (by matching user questions q…
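The idea of retrieving passages indirectly, through a database of generated questions, can be sketched as follows. This is a toy illustration and not the authors' implementation. The database entries, the function names, and the bag-of-words cosine similarity are all assumptions made for the example, and a real system would most likely use a learned retriever instead.

```python
import re
from collections import Counter
from math import sqrt

# Hypothetical toy database: each entry pairs a generated, unambiguous question
# with its answer and the id of the passage it was generated from.
QA_DB = [
    {"question": "Who played the Joker in The Dark Knight (2008)?",
     "answer": "Heath Ledger", "passage_id": "p1"},
    {"question": "Who played the Joker in Joker (2019)?",
     "answer": "Joaquin Phoenix", "passage_id": "p2"},
    {"question": "Who played the Joker in Batman (1989)?",
     "answer": "Jack Nicholson", "passage_id": "p3"},
    {"question": "What is the capital of France?",
     "answer": "Paris", "passage_id": "p4"},
]


def bag_of_words(text):
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(user_question, db, k=3):
    """Match the user question against generated questions (not passages),
    then return up to k entries, keeping one entry per source passage."""
    query = bag_of_words(user_question)
    ranked = sorted(db, key=lambda e: cosine(query, bag_of_words(e["question"])),
                    reverse=True)
    hits, seen_passages = [], set()
    for entry in ranked:
        if entry["passage_id"] in seen_passages:
            continue  # skip repeats so the result covers distinct passages
        seen_passages.add(entry["passage_id"])
        hits.append(entry)
        if len(hits) == k:
            break
    return hits


hits = retrieve("Who played the Joker?", QA_DB, k=3)
for hit in hits:
    print(hit["passage_id"], "|", hit["question"], "->", hit["answer"])
```

In this toy setup the ambiguous query matches three generated questions, each tied to a different passage and a different answer, which is the kind of diverse evidence a long-form answer would need to summarize.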
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech and dialogue systems
