Attention-over-Attention Neural Networks for Reading Comprehension

Yiming Cui; Zhipeng Chen; Si Wei; Shijin Wang; Ting Liu; Guoping Hu

arXiv:1607.04423·cs.CL·September 2, 2019

TL;DR

This paper introduces an attention-over-attention neural network for Cloze-style reading comprehension. The model places a second attention layer over the document-level attention and outperforms previous models on the CNN and Children's Book Test benchmark datasets.

Contribution

The attention-over-attention mechanism requires fewer pre-defined hyper-parameters than previous reading comprehension models and uses a more elegant architecture, while delivering significant performance gains on public benchmarks.

Findings

01 Outperforms state-of-the-art models on the CNN dataset

02 Achieves superior results on the Children's Book Test dataset

03 Requires fewer hyper-parameters than previous models

Abstract

Cloze-style queries are representative problems in reading comprehension. Over the past few months, we have seen much progress in using neural network approaches to solve Cloze-style questions. In this paper, we present a novel model called the attention-over-attention reader for the Cloze-style reading comprehension task. Our model places another attention mechanism over the document-level attention and induces "attended attention" for final predictions. Unlike previous work, our neural network model requires fewer pre-defined hyper-parameters and uses an elegant architecture for modeling. Experimental results show that the proposed attention-over-attention model outperforms various state-of-the-art systems by a large margin on public datasets such as the CNN and Children's Book Test datasets.
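The idea of placing one attention over another can be illustrated with a small NumPy sketch. The abstract does not spell out the computation, so everything below is an assumption made for illustration: contextual embeddings are taken as given matrices, matching scores are plain dot products, the second attention is an average of query-level attentions, and the function names (`attention_over_attention`, `score_candidates`) are hypothetical rather than taken from the authors' code.

```python
import numpy as np


def softmax(x, axis):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def attention_over_attention(doc, query):
    """Return one attention weight per document position.

    doc:   (D, h) contextual embeddings of the document tokens (assumed given)
    query: (Q, h) contextual embeddings of the query tokens (assumed given)
    """
    # Pairwise matching scores between every document and query token.
    matching = doc @ query.T                      # shape (D, Q)

    # Document-level attention: one distribution over the document
    # for each query token (each column sums to 1).
    doc_attention = softmax(matching, axis=0)     # shape (D, Q)

    # Query-level attention: one distribution over the query
    # for each document token (each row sums to 1), then averaged
    # over document positions to weigh the query tokens.
    query_attention = softmax(matching, axis=1)   # shape (D, Q)
    query_importance = query_attention.mean(axis=0)  # shape (Q,)

    # "Attended attention": document-level attentions combined
    # using the query-level weights.
    return doc_attention @ query_importance       # shape (D,)


def score_candidates(attended, doc_tokens, candidates):
    """Sum the attended attention over every occurrence of each candidate."""
    return {
        cand: float(sum(w for w, tok in zip(attended, doc_tokens) if tok == cand))
        for cand in candidates
    }
```

Under these assumptions the result is itself a probability distribution over document positions, so a Cloze answer can be picked by summing the weights of each candidate word's occurrences and taking the highest total.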
