Know What You Don't Know: Unanswerable Questions for SQuAD

Pranav Rajpurkar, Robin Jia, and Percy Liang

arXiv:1806.03822 · cs.CL · June 12, 2018 · 214 cites

TL;DR

SQuAD 2.0 adds over 50,000 adversarially written unanswerable questions to the SQuAD reading comprehension dataset, so systems must both answer questions when the paragraph supports an answer and abstain when it does not.

Contribution

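The paper presents SQuAD 2.0, a dataset that combines the existing answerable SQuAD questions with crowdworker-written unanswerable questions designed to look similar to answerable ones, so that question answering models are tested on abstaining as well as on answering.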

Findings

01. Existing models struggle with unanswerable questions.

02. Adding unanswerable questions decreases model F1 scores.

03. SQuAD 2.0 sets a new benchmark for answerability detection.

Abstract

Extractive reading comprehension systems can often locate the correct answer to a question in a context document, but they also tend to make unreliable guesses on questions for which the correct answer is not stated in the context. Existing datasets either focus exclusively on answerable questions, or use automatically generated unanswerable questions that are easy to identify. To address these weaknesses, we present SQuAD 2.0, the latest version of the Stanford Question Answering Dataset (SQuAD). SQuAD 2.0 combines existing SQuAD data with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD 2.0, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering. SQuAD 2.0 is a challenging natural language understanding task for…
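The answer-or-abstain requirement described in the abstract can be illustrated with a minimal sketch. It assumes abstention is represented as an empty string and that a model exposes some no-answer score; the names `token_f1`, `answer_or_abstain`, `NO_ANSWER`, the threshold value, and the example data are all hypothetical and are not taken from the paper or from the official evaluation script. The token matching is simplified (lowercasing and whitespace splitting only).

```python
from collections import Counter

NO_ANSWER = ""  # assumption: abstention is represented as an empty string


def token_f1(prediction: str, gold: str) -> float:
    """Simplified token-level F1 that also handles abstention."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    # If either side is "no answer", score 1.0 only when both abstain.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts tokens shared by prediction and gold.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def answer_or_abstain(span: str, no_answer_score: float,
                      threshold: float = 0.5) -> str:
    """Return the span, or abstain when the (hypothetical) no-answer
    score exceeds the threshold."""
    return NO_ANSWER if no_answer_score > threshold else span


# Illustrative data: (predicted span, hypothetical no-answer score, gold answer)
examples = [
    ("in 1066", 0.1, "1066"),        # answerable, partially correct span
    ("Normandy", 0.9, NO_ANSWER),    # unanswerable, model abstains
    ("Rollo", 0.2, NO_ANSWER),       # unanswerable, model guesses anyway
]
scores = [token_f1(answer_or_abstain(s, p), g) for s, p, g in examples]
mean_f1 = sum(scores) / len(scores)
```

In this sketch a confident guess on an unanswerable question scores zero, while a correct abstention scores one, which is why a system that always produces a span is penalized once unanswerable questions are mixed in.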

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification