Bidirectional Attention Flow for Machine Comprehension

Minjoon Seo; Aniruddha Kembhavi; Ali Farhadi; Hannaneh Hajishirzi

arXiv:1611.01603 · cs.CL · June 22, 2018 · 1.3k cites

TL;DR

This paper introduces the Bi-Directional Attention Flow (BIDAF) network, a model for machine comprehension that captures complex context-query interactions without early summarization and achieves state-of-the-art results on SQuAD and the CNN/DailyMail cloze test.

Contribution

The paper presents a new bi-directional attention flow mechanism and a hierarchical model that improves context-query interaction modeling in machine comprehension.

Findings

01. Achieves state-of-the-art results on the SQuAD dataset.

02. Outperforms previous models on the CNN/DailyMail cloze test.

03. Represents the context at multiple levels of granularity.

Abstract

Machine comprehension (MC), answering a query about a given context paragraph, requires modeling complex interactions between the context and the query. Recently, attention mechanisms have been successfully extended to MC. Typically these methods use attention to focus on a small portion of the context and summarize it with a fixed-size vector, couple attentions temporally, and/or often form a uni-directional attention. In this paper we introduce the Bi-Directional Attention Flow (BIDAF) network, a multi-stage hierarchical process that represents the context at different levels of granularity and uses bi-directional attention flow mechanism to obtain a query-aware context representation without early summarization. Our experimental evaluations show that our model achieves the state-of-the-art results in Stanford Question Answering Dataset (SQuAD) and CNN/DailyMail cloze test.
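The idea of attending in both directions while keeping one vector per context position can be illustrated with a small sketch. This is not the authors' implementation: the function names (`bidaf_attention`, `weighted_sum`) are hypothetical, the similarity score is assumed to be a plain dot product rather than a learned function, the way the attended vectors are merged is one reasonable choice among several, and the encoding and output stages of the full model are left out.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(weights, vectors):
    """Sum of vectors, each scaled by its weight."""
    dim = len(vectors[0])
    return [sum(w * v[k] for w, v in zip(weights, vectors)) for k in range(dim)]


def bidaf_attention(context, query):
    """Return one query-aware vector per context position.

    context: list of T vectors, query: list of J vectors, same dimension d.
    ASSUMPTION: similarity is a dot product (no trained parameters).
    """
    # sim[t][j] scores context position t against query position j.
    sim = [[dot(h, u) for u in query] for h in context]

    # Context-to-query: each context position attends over the query words.
    c2q = [weighted_sum(softmax(row), query) for row in sim]

    # Query-to-context: weight context positions by their best query match,
    # giving a single attended context vector shared by all positions.
    q2c = weighted_sum(softmax([max(row) for row in sim]), context)

    # Merge per position (ASSUMED layout): [h; c2q; h*c2q; h*q2c].
    merged = []
    for h, u_att in zip(context, c2q):
        merged.append(
            h
            + u_att
            + [a * b for a, b in zip(h, u_att)]
            + [a * b for a, b in zip(h, q2c)]
        )
    return merged


context = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # T = 3 toy context vectors
query = [[1.0, 0.0], [0.0, 1.0]]                # J = 2 toy query vectors
G = bidaf_attention(context, query)
print(len(G), len(G[0]))
```

The output has as many rows as the context has positions, so nothing is collapsed into a single fixed-size vector at this stage; that is the sense in which the attention avoids early summarization.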

Taxonomy

Topics: Topic Modeling · Advanced Graph Neural Networks · Recommender Systems and Techniques