Dynamic Coattention Networks For Question Answering

Caiming Xiong; Victor Zhong; Richard Socher

arXiv:1611.01604·cs.CL·March 8, 2018·338 cites

Dynamic Coattention Networks For Question Answering

Caiming Xiong, Victor Zhong, Richard Socher

PDF

Open Access 5 Repos

TL;DR

The paper introduces the Dynamic Coattention Network (DCN), a novel question answering model that iteratively refines answer spans, significantly improving accuracy on the Stanford dataset.

Contribution

The paper presents the DCN, which fuses question and document representations and uses a dynamic decoder to recover from local maxima, advancing question answering performance.

Findings

01

Single DCN achieves 75.9% F1 on Stanford dataset.

02

Ensemble of DCNs reaches 80.4% F1, surpassing previous state-of-the-art.

03

Model effectively recovers from initial incorrect answer predictions.

Abstract

Several deep learning models have been proposed for question answering. However, due to their single-pass nature, they have no way to recover from local maxima corresponding to incorrect answers. To address this problem, we introduce the Dynamic Coattention Network (DCN) for question answering. The DCN first fuses co-dependent representations of the question and the document in order to focus on relevant parts of both. Then a dynamic pointing decoder iterates over potential answer spans. This iterative procedure enables the model to recover from initial local maxima corresponding to incorrect answers. On the Stanford question answering dataset, a single DCN model improves the previous state of the art from 71.0% F1 to 75.9%, while a DCN ensemble obtains 80.4% F1.

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Repositories

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsTopic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications