TL;DR
The paper introduces the Gated-Attention (GA) Reader, a multi-hop neural network whose multiplicative attention mechanism builds query-specific token representations for cloze-style question answering over documents.
Contribution
It presents a new multi-hop neural network architecture with multiplicative attention interactions, achieving state-of-the-art results on multiple benchmarks.
Findings
Achieves state-of-the-art results on CNN, Daily Mail, and Who Did What datasets.
Demonstrates the effectiveness of multiplicative attention through ablation studies.
Shows that multiplicative gating outperforms alternative compositional operators for implementing the gated attention.
Abstract
In this paper we study the problem of answering cloze-style questions over documents. Our model, the Gated-Attention (GA) Reader, integrates a multi-hop architecture with a novel attention mechanism, which is based on multiplicative interactions between the query embedding and the intermediate states of a recurrent neural network document reader. This enables the reader to build query-specific representations of tokens in the document for accurate answer selection. The GA Reader obtains state-of-the-art results on three benchmarks for this task: the CNN & Daily Mail news stories and the Who Did What dataset. The effectiveness of multiplicative interaction is demonstrated by an ablation study, and by comparing to alternative compositional operators for implementing the gated-attention. The code is available at https://github.com/bdhingra/ga-reader.
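The multiplicative interaction described in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes each document token attends over the query tokens with a dot-product softmax and is then multiplied element-wise by the resulting query summary; the abstract does not spell out this form. The names `gated_attention`, `softmax`, and `dot` are hypothetical helpers chosen for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gated_attention(doc_states, query_states):
    """Gate each document token state by a token-specific query summary.

    doc_states:   list of document token vectors (e.g. RNN hidden states)
    query_states: list of query token vectors of the same dimension
    Assumption: dot-product attention over query tokens, followed by
    element-wise multiplication.
    """
    gated = []
    for d in doc_states:
        # Attention weights of this document token over the query tokens.
        weights = softmax([dot(q, d) for q in query_states])
        # Weighted sum of query vectors: a query summary specific to this token.
        q_tilde = [
            sum(w * q[k] for w, q in zip(weights, query_states))
            for k in range(len(d))
        ]
        # Multiplicative interaction between token state and query summary.
        gated.append([dk * qk for dk, qk in zip(d, q_tilde)])
    return gated


doc = [[1.0, 0.0], [0.0, 2.0]]
query = [[1.0, 1.0]]
print(gated_attention(doc, query))
```

In a multi-hop reader, a step like this would presumably be applied between successive recurrent layers, so that each layer reads token representations already conditioned on the query.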
