Text Understanding with the Attention Sum Reader Network

Rudolf Kadlec; Martin Schmid; Ondrej Bajgar; Jan Kleindienst

arXiv:1603.01547·cs.CL·June 27, 2016

TL;DR

The paper introduces the Attention Sum Reader Network, a simple attention-based model for cloze-style question answering that directly selects answers from the context, achieving state-of-the-art results on multiple datasets.

Contribution

The paper proposes a simple attention mechanism that picks the answer directly from the context, instead of computing it from a blended representation of the document's words as earlier, more complex models do.

Findings

01. The approach sets new state-of-the-art results on the CNN, Daily Mail, and Children's Book Test datasets.

02. The model is particularly suited to questions whose answer is a single word from the document.

03. An ensemble of models improves performance further.

Abstract

Several large cloze-style context-question-answer datasets have been introduced recently: the CNN and Daily Mail news data and the Children's Book Test. Thanks to the size of these datasets, the associated text comprehension task is well suited for deep-learning techniques that currently seem to outperform all alternative approaches. We present a new, simple model that uses attention to directly pick the answer from the context as opposed to computing the answer using a blended representation of words in the document as is usual in similar models. This makes the model particularly suitable for question-answering problems where the answer is a single word from the document. Ensemble of our models sets new state of the art on all evaluated datasets.
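
As a rough illustration of the answer-selection idea described in the abstract, the sketch below attends over document positions and then sums the attention that falls on each candidate word, which is what the "attention sum" in the model's name refers to. The function name `attention_sum`, the toy tokens, and the hand-picked vectors are assumptions made for this example. The learned document and question encoders are replaced by fixed vectors, so this shows only the final selection step and is not the authors' implementation.

```python
import numpy as np

def attention_sum(doc_tokens, doc_vecs, query_vec, candidates):
    """Score each candidate by summing attention over its occurrences in the document."""
    # Dot-product match between every document position and the question.
    scores = doc_vecs @ query_vec
    # Softmax over document positions (shifted by the max for numerical stability).
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    # Attention sum: pool the weights of all positions where a candidate occurs.
    return {
        cand: float(sum(w for tok, w in zip(doc_tokens, weights) if tok == cand))
        for cand in candidates
    }

# Toy inputs (hypothetical): a real model would produce these vectors with learned encoders.
doc_tokens = ["mary", "met", "john", "and", "mary", "smiled"]
doc_vecs = np.array([
    [0.8, 0.0],  # mary
    [0.0, 0.0],  # met
    [1.0, 0.0],  # john
    [0.0, 0.0],  # and
    [0.8, 0.0],  # mary
    [0.0, 0.0],  # smiled
])
query_vec = np.array([1.0, 0.0])

probs = attention_sum(doc_tokens, doc_vecs, query_vec, ["mary", "john"])
answer = max(probs, key=probs.get)
print(answer, probs)
```

In this toy setup the single position for "john" receives more attention than either individual position for "mary", yet "mary" is selected because its two occurrences pool their weight. The output is always a word that appears in the document, never a blend of several word representations.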
