Simple and Effective Multi-Paragraph Reading Comprehension

Christopher Clark; Matt Gardner

arXiv:1710.10723·cs.CL·November 8, 2017

TL;DR

This paper introduces a method for adapting paragraph-level question answering models to handle entire documents by training them to produce calibrated confidence scores, resulting in significant performance improvements on document QA datasets.

Contribution

The paper proposes a shared-normalization training objective combined with a pipeline for document QA, enabling models to effectively answer questions on full documents.

Findings

01 Achieved 71.3 F1 on the web portion of TriviaQA

02 Improved on the previous best system (56.7 F1) by 14.6 F1 points

03 Demonstrated strong performance on several document QA datasets

Abstract

We consider the problem of adapting neural paragraph-level question answering models to the case where entire documents are given as input. Our proposed solution trains models to produce well calibrated confidence scores for their results on individual paragraphs. We sample multiple paragraphs from the documents during training, and use a shared-normalization training objective that encourages the model to produce globally correct output. We combine this method with a state-of-the-art pipeline for training models on document QA data. Experiments demonstrate strong performance on several document QA datasets. Overall, we are able to achieve a score of 71.3 F1 on the web portion of TriviaQA, a large improvement from the 56.7 F1 of the previous best system.
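The shared-normalization idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes the model emits one unnormalized score per token for the answer start (the end would be handled the same way), and that the loss is the negative log of the total probability mass on correct tokens. The function names `shared_norm_loss` and `per_paragraph_loss` are hypothetical. The point is only that a single softmax spans every paragraph sampled for a question, so a confident wrong score in a paragraph with no answer is penalized.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a non-empty list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def shared_norm_loss(paragraph_scores, answer_positions):
    """Negative log-likelihood with ONE softmax shared across all paragraphs.

    paragraph_scores: list of lists of unnormalized per-token scores,
        one inner list per paragraph sampled for the same question.
    answer_positions: list of lists of correct token indices per paragraph
        (empty for a paragraph that does not contain the answer).
    """
    all_scores = [s for para in paragraph_scores for s in para]
    gold_scores = [
        paragraph_scores[p][i]
        for p, idxs in enumerate(answer_positions)
        for i in idxs
    ]
    if not gold_scores:
        raise ValueError("at least one paragraph must contain an answer")
    # -log( sum over gold tokens of exp(score) / sum over ALL tokens of exp(score) )
    return log_sum_exp(all_scores) - log_sum_exp(gold_scores)


def per_paragraph_loss(paragraph_scores, answer_positions):
    """Baseline for comparison: an independent softmax inside each paragraph.

    Paragraphs without an answer contribute nothing, so their scores
    are never constrained by this objective.
    """
    total = 0.0
    for para, idxs in zip(paragraph_scores, answer_positions):
        if idxs:
            total += log_sum_exp(para) - log_sum_exp([para[i] for i in idxs])
    return total


# Paragraph 0 contains the answer at token 0; paragraph 1 is a distractor
# whose first token nevertheless receives a high score.
scores = [[2.0, 0.0], [5.0, 0.0]]
answers = [[0], []]
shared = shared_norm_loss(scores, answers)
independent = per_paragraph_loss(scores, answers)
```

In this toy case the independent objective ignores the distractor paragraph entirely, while the shared objective assigns a larger loss because the distractor's high score competes in the same normalization. That is the sense in which scores become comparable across paragraphs at test time.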

Code & Models

Repositories

allenai/document-qa (TensorFlow, official)
