Exploring and Exploiting Multi-Granularity Representations for Machine Reading Comprehension

Nuo Chen; Chenyu You

arXiv:2208.08750·cs.CL·August 19, 2022

Open Access

TL;DR

This paper introduces ABA-Net, a model that uses capsule networks and self-attention to exploit multi-granularity representations for machine reading comprehension, achieving state-of-the-art results on SQuAD 1.0 and proving effective on SQuAD 2.0 and CoQA.

Contribution

Proposes ABA-Net, a new approach that adaptively exploits multi-level source representations using capsule networks and self-attention in MRC.
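
As a rough illustration of what "adaptively exploiting multi-level representations" could look like, the sketch below fuses the per-layer vectors of a single token with attention weights computed over the layers. This is an assumption-laden toy, not ABA-Net itself: the capsule-network component is omitted, and the names `fuse_layers`, `layer_vectors`, and `query` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_layers(layer_vectors, query):
    """Fuse one token's representations from several encoder layers.

    layer_vectors: list of vectors, one per encoder layer (hypothetical input).
    query: vector of the same dimension used to score each layer (hypothetical).
    Returns the fused vector and the attention weight given to each layer.
    """
    dim = len(query)
    # Scaled dot-product score between the query and each layer's vector.
    scores = [
        sum(q * v for q, v in zip(query, vec)) / math.sqrt(dim)
        for vec in layer_vectors
    ]
    weights = softmax(scores)
    # Weighted sum of the layer vectors, dimension by dimension.
    fused = [
        sum(w * vec[i] for w, vec in zip(weights, layer_vectors))
        for i in range(dim)
    ]
    return fused, weights


# Toy input: three "layers" of a 2-dimensional token representation.
layers = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.0, 0.0]  # a zero query scores every layer equally
fused, weights = fuse_layers(layers, query)
```

With a zero query every layer receives the same weight, so the fused vector is the plain average of the layers; a non-zero query shifts weight toward the layers it aligns with, in contrast to a predictor that reads only the final layer.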

Findings

1. Achieves new state-of-the-art results on SQuAD 1.0.

2. Effective on the SQuAD 2.0 and CoQA datasets.

3. Demonstrates the benefit of multi-granularity representations.

Abstract

Recently, attention-enhanced multi-layer encoders, such as the Transformer, have been extensively studied in Machine Reading Comprehension (MRC). To predict the answer, it is common practice to employ a predictor that draws information only from the final encoder layer, which generates coarse-grained representations of the source sequences, i.e., the passage and question. Analysis shows that the representation of a source sequence shifts from fine-grained to coarse-grained as the encoding layer deepens. It is generally believed that, as the number of layers in a deep neural network grows, the encoding process gathers increasingly more relevant information for each location, resulting in more coarse-grained representations, which increases the likelihood that a location resembles other locations (referred to as homogeneity). Such a phenomenon will mislead the model into making wrong judgments and degrade the…
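
One way to make the homogeneity notion in the abstract concrete is to measure how similar the representations at different positions are to one another within a layer. The sketch below computes the mean pairwise cosine similarity across positions. The metric choice, the function names, and the toy vectors are assumptions for illustration only; they are not the paper's analysis or data.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_cosine(positions):
    """Average cosine similarity over all distinct pairs of positions.

    positions: list of at least two non-zero vectors, one per token position.
    Values near 1.0 mean the positions look alike (more homogeneous).
    """
    n = len(positions)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(cosine(positions[i], positions[j]) for i, j in pairs) / len(pairs)


# Made-up vectors: a "lower" layer with distinct positions and an
# "upper" layer whose positions have drifted toward one another.
lower = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
upper = [[1.0, 0.1], [1.0, 0.2], [1.0, 0.0]]

lower_score = mean_pairwise_cosine(lower)
upper_score = mean_pairwise_cosine(upper)
```

On these made-up vectors the upper layer scores much closer to 1.0 than the lower one, which is the pattern the abstract describes as representations becoming harder to tell apart in deeper layers.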

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications

Methods: Attention Is All You Need · Linear Layer · Dense Connections · Label Smoothing · Position-Wise Feed-Forward Layer · Residual Connection · Softmax · Dropout · Adam · Layer Normalization