Narrowing the Knowledge Evaluation Gap: Open-Domain Question Answering with Multi-Granularity Answers
Gal Yona, Roee Aharoni, Mor Geva

TL;DR
This paper introduces GRANOLA QA, an evaluation framework for open-domain question answering that considers multi-granularity answers, and proposes DRAG, a decoding method that improves answer accuracy by aligning response granularity with model uncertainty.
Contribution
The paper presents GRANOLA QA and GRANOLA-EQ datasets, and introduces DRAG, a decoding algorithm that enhances answer accuracy by considering answer granularity and model uncertainty.
Findings
DRAG improves accuracy by nearly 20 points on GRANOLA-EQ.
Standard decoding often produces overly specific answers that are incorrect.
Multi-granularity evaluation reveals more knowledge in language models than standard methods.
Abstract
Factual questions typically can be answered correctly at different levels of granularity. For example, both ``August 4, 1961'' and ``1961'' are correct answers to the question ``When was Barack Obama born?''. Standard question answering (QA) evaluation protocols, however, do not explicitly take this into account and compare a predicted answer against answers of a single granularity level. In this work, we propose GRANOLA QA, a novel evaluation setting where a predicted answer is evaluated in terms of accuracy and informativeness against a set of multi-granularity answers. We present a simple methodology for enriching existing datasets with multi-granularity answers, and create GRANOLA-EQ, a multi-granularity version of the EntityQuestions dataset. We evaluate a range of decoding methods on GRANOLA-EQ, including a new algorithm, called Decoding with Response Aggregation (DRAG), that is…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
Taxonomy
TopicsTopic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
MethodsSparse Evolutionary Training
