EMBRACE: Evaluation and Modifications for Boosting RACE

Mariia Zyrianova; Dmytro Kalpakchi; Johan Boye

arXiv:2305.08433 · cs.CL · May 16, 2023 · 1 citation

TL;DR

This paper critically evaluates the RACE dataset for machine reading comprehension, analyzing question difficulty, justification bases, and biases, and identifies a high-quality subset to improve evaluation standards.

Contribution

It provides a detailed analysis of RACE's quality, identifies issues with question validity and bias, and proposes a high-quality subset for more reliable model evaluation.

Findings

01 Many MCQs do not meet basic comprehension requirements

02 Bases for answer justification are biased towards specific text parts

03 A high-quality subset of RACE is identified for better evaluation

Abstract

When training and evaluating machine reading comprehension models, it is very important to work with high-quality datasets that are also representative of real-world reading comprehension tasks. This requirement includes, for instance, having questions that are based on texts of different genres and require generating inferences or reflecting on the reading material. In this article we turn our attention to RACE, a dataset of English texts and corresponding multiple-choice questions (MCQs). Each MCQ consists of a question and four alternatives (of which one is the correct answer). RACE was constructed by Chinese teachers of English for human reading comprehension and is widely used as training material for machine reading comprehension models. By construction, RACE should satisfy the aforementioned quality requirements, and the purpose of this article is to check whether they are…

Code & Models

Repositories

dkalpakchi/embrace (official)

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text Readability and Simplification

Methods: Test