Iterative Alternating Neural Attention for Machine Reading

Alessandro Sordoni; Philip Bachman; Adam Trischler; Yoshua Bengio

arXiv:1606.02245·cs.CL·November 10, 2016·36 cites

TL;DR

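This paper introduces an iterative alternating neural attention model for machine reading comprehension. Rather than collapsing the query into a single vector, the model alternates attention between the query and the document, and it outperforms previous state-of-the-art baselines on the CNN and Children's Book Test (CBT) benchmarks.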

Contribution

It presents a novel neural attention architecture that alternates, over several iterations, between attending to the query and attending to the document, allowing a fine-grained exploration of both.

Findings

01 Outperforms state-of-the-art baselines on the CNN news articles and Children's Book Test (CBT) datasets.

02 Demonstrates the effectiveness of iterative alternating attention in machine comprehension.

03 Achieves significant improvements in answer accuracy.

Abstract

We propose a novel neural attention architecture to tackle machine comprehension tasks, such as answering Cloze-style queries with respect to a document. Unlike previous models, we do not collapse the query into a single vector; instead, we deploy an iterative alternating attention mechanism that allows a fine-grained exploration of both the query and the document. Our model outperforms state-of-the-art baselines in standard machine comprehension benchmarks such as CNN news articles and the Children's Book Test (CBT) dataset.
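
The alternating mechanism described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the query and document tokens are already encoded as fixed-size vectors, uses plain dot-product attention, and replaces any learned recurrent update with a simple tanh step. The names `glimpse`, `alternating_attention`, and `steps` are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def glimpse(encodings, key):
    """Attend over token encodings using `key`.

    Returns the attention weights and the weighted sum of the encodings.
    """
    weights = softmax([dot(e, key) for e in encodings])
    dim = len(encodings[0])
    summary = [sum(w * e[i] for w, e in zip(weights, encodings))
               for i in range(dim)]
    return weights, summary


def alternating_attention(query_enc, doc_enc, steps=3):
    """Alternate query and document glimpses for a fixed number of steps.

    Assumption: the state update is a plain tanh of a sum, standing in
    for whatever learned update a real model would use.
    """
    dim = len(query_enc[0])
    state = [0.0] * dim  # hypothetical initial search state
    doc_weights = None
    for _ in range(steps):
        # 1) Query glimpse: decide which part of the query to focus on.
        _, q_glimpse = glimpse(query_enc, state)
        # 2) Document glimpse, conditioned on the state and the query glimpse.
        doc_key = [s + q for s, q in zip(state, q_glimpse)]
        doc_weights, d_glimpse = glimpse(doc_enc, doc_key)
        # 3) Fold both glimpses back into the search state.
        state = [math.tanh(s + q + d)
                 for s, q, d in zip(state, q_glimpse, d_glimpse)]
    return doc_weights


# Toy encodings: two query tokens and three document tokens, dimension 2.
query_enc = [[1.0, 0.0], [0.0, 1.0]]
doc_enc = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = alternating_attention(query_enc, doc_enc)
```

The key point of the sketch is that the query is never reduced to one vector up front: at every step the query is re-read under the current state, and that fresh query glimpse shapes where the document is read next. The returned `weights` form a distribution over document positions.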

Code & Models

Repositories

AI-metrics/AI-metrics

Taxonomy

Topics: Topic Modeling · Explainable Artificial Intelligence (XAI) · Natural Language Processing Techniques