PALRACE: Reading Comprehension Dataset with Human Data and Labeled Rationales

Jiajie Zou; Yuran Zhang; Peiqing Jin; Cheng Luo; Xunyi Pan; Nai Ding

arXiv:2106.12373 · cs.CL · March 25, 2022 · 5 cites

TL;DR

PALRACE is a new reading comprehension dataset with human-labeled rationales for 800 passages selected from RACE. Models improve when given access to the human rationales; simpler models benefit most, and the reported gain exceeds 30%.

Contribution

The paper presents PALRACE, a dataset with human rationales for machine reading comprehension (MRC), and shows that human rationales improve model performance, particularly for simpler models.

Findings

01. Models such as RoBERTa-large outperform human readers on all 6 question types, including inference questions.

02. Access to human rationales improves performance by over 30%.

03. Simpler models can match BERT-base performance when given human rationales.

Abstract

Pre-trained language models achieve high performance on machine reading comprehension (MRC) tasks, but the results are hard to explain. An appealing approach to making models explainable is to provide rationales for their decisions. To investigate whether human rationales can further improve current models and to facilitate supervised learning of human rationales, here we present PALRACE (Pruned And Labeled RACE), a new MRC dataset with human-labeled rationales for 800 passages selected from the RACE dataset. We further classified the questions for each passage into 6 types. Each passage was read by at least 26 human readers, who labeled their rationales for answering the question. It is demonstrated that models such as RoBERTa-large outperform human readers on all 6 types of questions, including inference questions, but their performance can be further improved when having access to the human…
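
To make the labeling setup in the abstract concrete, here is a minimal Python sketch of how one question with rationale labels from several readers might be represented and summarized. It is not the dataset's actual file format: the `RationaleRecord` class, its field names, and the `sentence_agreement` helper are hypothetical, and it assumes rationales are marked at the sentence level, which the abstract does not specify.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class RationaleRecord:
    """Hypothetical record: one question plus per-reader rationale labels."""
    passage_sentences: List[str]
    question: str
    question_type: str  # one of the 6 types; the label string is a placeholder
    # One set per reader: indices of sentences that reader marked as rationale
    # (assumption: sentence-level labels).
    reader_labels: List[Set[int]] = field(default_factory=list)


def sentence_agreement(record: RationaleRecord) -> List[float]:
    """Return, per sentence, the fraction of readers who marked it."""
    n_readers = len(record.reader_labels)
    if n_readers == 0:
        return [0.0] * len(record.passage_sentences)
    counts = Counter(i for labels in record.reader_labels for i in labels)
    return [counts[i] / n_readers for i in range(len(record.passage_sentences))]
```

A per-sentence agreement score like this is one simple way to turn labels from many readers into a single soft rationale signal; other aggregation choices (majority vote, union) are equally plausible.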

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: GloVe Embeddings