A3Net: Adversarial-and-Attention Network for Machine Reading Comprehension

Jiuniu Wang, Xingyu Fu, Guangluan Xu, Yirong Wu, Ziyan Chen, Yang Wei, and Li Jin

arXiv:1809.00676·cs.CL·September 5, 2018

Open Access

TL;DR

A3Net is a novel machine reading comprehension model that combines adversarial training on multiple target variables with a multi-layer attention mechanism, leading to improved robustness and state-of-the-art performance on WebQA.

Contribution

The paper introduces A3Net, which integrates adversarial training on multiple target variables with a multi-layer attention network for machine reading comprehension.

Findings

01 Outperforms state-of-the-art models on WebQA with a Fuzzy Score of 77.0%.

02 Adversarial training improves model robustness and generalization.

03 Multi-layer attention enhances question-passage interaction.

Abstract

In this paper, we introduce Adversarial-and-attention Network (A3Net) for Machine Reading Comprehension. This model extends existing approaches from two perspectives. First, adversarial training is applied to several target variables within the model, rather than only to the inputs or embeddings. We control the norm of adversarial perturbations according to the norm of original target variables, so that we can jointly add perturbations to several target variables during training. As an effective regularization method, adversarial training improves robustness and generalization of our model. Second, we propose a multi-layer attention network utilizing three kinds of high-efficiency attention mechanisms. Multi-layer attention conducts interaction between question and passage within each layer, which contributes to reasonable representation and understanding of the model. Combining these…
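
The norm control described in the abstract can be illustrated with a minimal sketch. It assumes the perturbation for each target variable points along a loss gradient and has an L2 norm equal to a fixed fraction of that variable's own L2 norm. The proportional rule, the `ratio` parameter, and the function names `scaled_perturbation` and `perturb_all` are hypothetical and are not taken from the paper; the gradients are supplied by hand here, whereas in practice they would come from backpropagation.

```python
import math


def l2_norm(vec):
    """L2 norm of a flat list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def scaled_perturbation(target, grad, ratio=0.1):
    """Perturbation along `grad` with L2 norm ratio * ||target||.

    Assumption: a proportional rule; the paper's exact scaling is not
    given in the abstract.
    """
    g_norm = l2_norm(grad)
    if g_norm == 0.0:
        # No gradient direction available, so apply no perturbation.
        return [0.0] * len(target)
    scale = ratio * l2_norm(target) / g_norm
    return [scale * g for g in grad]


def perturb_all(targets, grads, ratio=0.1):
    """Jointly perturb several named target variables.

    `targets` and `grads` are dicts mapping a variable name to a flat
    list of floats (hypothetical layout for illustration).
    """
    perturbed = {}
    for name, value in targets.items():
        delta = scaled_perturbation(value, grads[name], ratio)
        perturbed[name] = [v + d for v, d in zip(value, delta)]
    return perturbed


if __name__ == "__main__":
    # Two variables with very different magnitudes (made-up numbers).
    targets = {"embedding": [3.0, 4.0], "hidden": [30.0, 40.0]}
    grads = {"embedding": [0.0, 2.0], "hidden": [1.0, 0.0]}
    print(perturb_all(targets, grads, ratio=0.1))
```

Because each perturbation is scaled relative to its own variable, a single `ratio` can be shared across variables of very different magnitudes, which is one plausible reading of why the abstract ties the perturbation norm to the norm of the original variable.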

Taxonomy

Topics: Topic Modeling · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning