# Quoref: A Reading Comprehension Dataset with Questions Requiring Coreferential Reasoning

**Authors:** Pradeep Dasigi, Nelson F. Liu, Ana Marasović, Noah A. Smith, Matt Gardner

arXiv: 1908.05803 · 2019-09-06

## TL;DR

This paper introduces Quoref, a challenging reading comprehension dataset focused on coreferential reasoning, revealing significant gaps in current models' abilities compared to humans.

## Contribution

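The paper presents Quoref, a crowdsourced dataset of more than 24K span-selection questions over 4.7K English Wikipedia paragraphs, each requiring coreference resolution. A strong baseline model acts as an adversary in the crowdsourcing loop, which helps crowdworkers avoid writing questions with exploitable surface cues.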

## Key findings

- State-of-the-art models trail estimated human performance on Quoref by 22.9 F1 points.
- The best model achieves 70.5 F1, while human performance is estimated at 93.4 F1.
- Quoref exposes limitations of current models in coreferential reasoning.
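
For readers unfamiliar with the metric, the F1 scores above can be read as a measure of token overlap between a predicted answer span and the gold span. This summary does not say how the paper computes F1, so the sketch below assumes the common SQuAD-style token-overlap definition with simple whitespace tokenization; the function name `token_f1` is hypothetical.

```python
from collections import Counter


def token_f1(prediction: str, gold: str) -> float:
    """Token-overlap F1 between two answer strings.

    Assumption: lowercased whitespace tokens, no punctuation or
    article stripping. Official evaluation scripts may normalize more.
    """
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()

    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)

    # Multiset intersection counts each shared token at most as often
    # as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under this definition a prediction that contains the gold answer plus extra tokens is penalized on precision, so partial credit is possible without an exact match.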

## Abstract

Machine comprehension of texts longer than a single sentence often requires coreference resolution. However, most current reading comprehension benchmarks do not contain complex coreferential phenomena and hence fail to evaluate the ability of models to resolve coreference. We present a new crowdsourced dataset containing more than 24K span-selection questions that require resolving coreference among entities in over 4.7K English paragraphs from Wikipedia. Obtaining questions focused on such phenomena is challenging, because it is hard to avoid lexical cues that shortcut complex reasoning. We deal with this issue by using a strong baseline model as an adversary in the crowdsourcing loop, which helps crowdworkers avoid writing questions with exploitable surface cues. We show that state-of-the-art reading comprehension models perform significantly worse than humans on this benchmark: the best model performance is 70.5 F1, while the estimated human performance is 93.4 F1.
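
The adversary-in-the-loop idea from the abstract can be illustrated with a small filter. This is only a sketch under stated assumptions: the abstract does not give the acceptance rule, so the code assumes a candidate question is kept when the baseline's answer does not match the gold answer. The names `Candidate`, `adversarial_filter`, and `first_token_baseline` are hypothetical, and the toy baseline stands in for a real reading comprehension model.

```python
from typing import Callable, Iterable, List, NamedTuple


class Candidate(NamedTuple):
    paragraph: str
    question: str
    answer: str


def adversarial_filter(
    candidates: Iterable[Candidate],
    baseline: Callable[[str, str], str],
) -> List[Candidate]:
    """Keep only candidates the baseline answers incorrectly.

    Assumption: "incorrect" means a case-insensitive string mismatch
    with the gold answer. A real pipeline might use an F1 threshold.
    """
    kept = []
    for cand in candidates:
        predicted = baseline(cand.paragraph, cand.question)
        if predicted.strip().lower() != cand.answer.strip().lower():
            kept.append(cand)
    return kept


def first_token_baseline(paragraph: str, question: str) -> str:
    """Toy stand-in for a model: always answers with the first token."""
    tokens = paragraph.split()
    return tokens[0] if tokens else ""


paragraph = "Anna met Maria. She gave her a book."
candidates = [
    Candidate(paragraph, "Who met Maria?", "Anna"),
    Candidate(paragraph, "Who received a book?", "Maria"),
]
hard_questions = adversarial_filter(candidates, first_token_baseline)
```

In this toy run the first question is rejected because the surface heuristic answers it, while the second survives because it needs the pronoun "her" to be resolved to Maria.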

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.05803/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/1908.05803/full.md

## References

28 references — full list in the complete paper: https://tomesphere.com/paper/1908.05803/full.md

---
Source: https://tomesphere.com/paper/1908.05803