Exploring and Exploiting Multi-Granularity Representations for Machine Reading Comprehension
Nuo Chen, Chenyu You

TL;DR
This paper introduces ABA-Net, a model that combines capsule networks and self-attention to exploit multi-granularity representations for machine reading comprehension, reporting a new state-of-the-art result on SQuAD 1.0 and effective results on SQuAD 2.0 and CoQA.
Contribution
Proposes ABA-Net, a new approach that adaptively exploits multi-level source representations using capsule networks and self-attention in MRC.
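To make "exploiting multi-level source representations" concrete, here is a minimal sketch that mixes the outputs of several encoder layers with softmax weights instead of reading only the last layer. It is a simplified stand-in, not ABA-Net itself: the capsule-network and self-attention aggregation is not described in this summary, and the names `mix_layers` and `layer_scores` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mix_layers(layer_outputs, layer_scores):
    """Combine per-layer token representations into one representation.

    layer_outputs: list of L layers, each a list of T token vectors of size D.
    layer_scores:  list of L floats (hypothetical; in a real model these would
                   be learned parameters or computed per token).
    Returns a list of T vectors of size D.
    """
    weights = softmax(layer_scores)
    num_tokens = len(layer_outputs[0])
    dim = len(layer_outputs[0][0])
    mixed = [[0.0] * dim for _ in range(num_tokens)]
    for weight, layer in zip(weights, layer_outputs):
        for t in range(num_tokens):
            for d in range(dim):
                mixed[t][d] += weight * layer[t][d]
    return mixed


# Two toy "layers", one token each; equal scores give a plain average.
layers = [[[1.0, 0.0]], [[0.0, 1.0]]]
mixed = mix_layers(layers, [0.0, 0.0])
```

With equal scores every layer contributes equally; raising one score shifts the mixed representation toward that layer, which is the sense in which the combination can adapt.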
Findings
Achieves a new state-of-the-art result on SQuAD 1.0.
Is also effective on the SQuAD 2.0 and CoQA datasets.
Demonstrates the benefit of multi-granularity representations.
Abstract
Recently, attention-enhanced multi-layer encoders, such as the Transformer, have been extensively studied in Machine Reading Comprehension (MRC). To predict the answer, it is common practice to employ a predictor that draws information only from the final encoder layer, which generates coarse-grained representations of the source sequences, i.e., the passage and question. Analysis shows that the representation of the source sequence shifts from fine-grained to coarse-grained as the encoding depth increases. It is generally believed that, as the number of layers in a deep neural network grows, the encoding process gathers increasingly more relevant information for each location, resulting in more coarse-grained representations, which raises the likelihood of similarity to other locations (referred to as homogeneity). Such a phenomenon will mislead the model into making wrong judgements and degrade the…
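One way to picture the homogeneity described above is to measure how similar the token representations within a layer are to one another. The sketch below uses mean pairwise cosine similarity as that measure. This is an assumption made for illustration; the excerpt does not say how the authors quantify homogeneity, and the function name `homogeneity` and the toy vectors are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def homogeneity(token_vectors):
    """Mean pairwise cosine similarity across positions in one layer.

    Assumes at least two token vectors, none of them all zeros.
    Values near 1 mean the positions look alike (more homogeneous).
    """
    pairs = list(combinations(token_vectors, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Toy data: distinct vectors for a shallow layer, near-identical for a deep one.
shallow_layer = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
deep_layer = [[1.0, 0.9], [0.9, 1.0], [1.0, 1.0]]

shallow_score = homogeneity(shallow_layer)
deep_score = homogeneity(deep_layer)
```

On this toy data the deep layer scores higher than the shallow one, matching the trend the abstract describes; on a real encoder the same function could be applied to each layer's hidden states in turn.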
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Attention Is All You Need · Linear Layer · Dense Connections · Label Smoothing · Position-Wise Feed-Forward Layer · Residual Connection · Softmax · Dropout · Adam · Layer Normalization
