A Framework for Evaluation of Machine Reading Comprehension Gold Standards
Viktor Schlegel, Marco Valentino, André Freitas, Goran Nenadic, Riza Batista-Navarro

TL;DR
This paper introduces a comprehensive framework to evaluate the quality and linguistic features of Machine Reading Comprehension gold standards, addressing challenges in data design and assessment reliability.
Contribution
It proposes a unifying framework with annotation schemas and metrics to systematically analyze MRC datasets for linguistic features, reasoning, knowledge, and lexical cues.
Findings
Lexical cues present in the datasets may lower the difficulty of comprehension.
The factual correctness of expected answers varies, which affects the reliability of evaluation.
Current gold standards lack features that contribute to lexical ambiguity.
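To make the notion of a lexical cue concrete, the sketch below scores how many question words reappear in each candidate sentence of a passage. It is an illustrative assumption, not the metric defined in the paper: the function names (tokens, overlap, best_overlap_sentence), the crude regex tokenizer, and the example sentences are all hypothetical. The idea is that if the highest-overlap sentence already contains the answer, simple word matching may be enough to solve the question.

```python
import re


def tokens(text):
    """Return the set of lowercase alphanumeric tokens (a deliberately crude tokenizer)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap(question, sentence):
    """Fraction of distinct question tokens that also occur in the sentence."""
    q = tokens(question)
    if not q:
        return 0.0
    return len(q & tokens(sentence)) / len(q)


def best_overlap_sentence(question, sentences):
    """Pick the sentence sharing the most question tokens (hypothetical helper)."""
    return max(sentences, key=lambda s: overlap(question, s))


# Made-up example data, for illustration only.
question = "Who wrote Hamlet?"
passage = [
    "The Globe opened in 1599.",
    "Shakespeare wrote Hamlet in 1600.",
]
print(best_overlap_sentence(question, passage))
```

A question whose answer sits in the top-scoring sentence offers a strong lexical cue; a dataset where this happens often would, under this rough measure, demand less understanding than its task description suggests.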
Abstract
Machine Reading Comprehension (MRC) is the task of answering a question over a paragraph of text. While neural MRC systems gain popularity and achieve noticeable performance, issues are being raised with the methodology used to establish their performance, particularly concerning the data design of gold standards that are used to evaluate them. There is but a limited understanding of the challenges present in this data, which makes it hard to draw comparisons and formulate reliable hypotheses. As a first step towards alleviating the problem, this paper proposes a unifying framework to systematically investigate the present linguistic features, required reasoning and background knowledge and factual correctness on one hand, and the presence of lexical cues as a lower bound for the requirement of understanding on the other hand. We propose a qualitative annotation schema for the first and…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
