Multi-range Reasoning for Machine Comprehension
Yi Tay, Luu Anh Tuan, Siu Cheung Hui

TL;DR
This paper introduces MRU, a fast encoder for machine comprehension that captures long- and short-term dependencies without recurrent layers, achieving competitive to state-of-the-art results on RACE, SearchQA and NarrativeQA.
Contribution
The paper presents MRU, a novel multi-range gating encoder that improves sequence representation for machine comprehension without relying on recurrent or convolutional layers.
Findings
Outperforms DFN on RACE by 1.5%-6% without using recurrent or convolutional layers.
Achieves competitive results on SearchQA and NarrativeQA without LSTM/GRU.
Further improves performance when combined with BiLSTM architectures.
Abstract
We propose MRU (Multi-Range Reasoning Units), a new fast compositional encoder for machine comprehension (MC). Our proposed MRU encoders are characterized by multi-ranged gating, executing a series of parameterized contract-and-expand layers for learning gating vectors that benefit from long and short-term dependencies. The aims of our approach are as follows: (1) learning representations that are concurrently aware of long and short-term context, (2) modeling relationships between intra-document blocks and (3) fast and efficient sequence encoding. We show that our proposed encoder demonstrates promising results both as a standalone encoder and as a complementary building block. We conduct extensive experiments on three challenging MC datasets, namely RACE, SearchQA and NarrativeQA, achieving highly competitive performance on all. On the RACE benchmark, our model outperforms DFN…
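The contract-and-expand idea mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the mean pooling, the tanh projection, the summation across ranges, and the names `contract_expand` and `multi_range_gate` are all assumptions made for illustration. The sketch only shows how block-level summaries at several ranges could be broadcast back to every position and turned into a gating vector.

```python
import numpy as np

def contract_expand(x, r, w):
    """Summarize blocks of r positions, then copy each summary back to its positions.

    x: (length, dim) sequence; r: block size (the "range"); w: (dim, dim) projection.
    Mean pooling and tanh are illustrative assumptions, not taken from the paper.
    """
    length, dim = x.shape
    pad = (-length) % r                                  # zero-pad so length divides by r
    padded = np.pad(x, ((0, pad), (0, 0)))
    blocks = padded.reshape(-1, r, dim).mean(axis=1)     # contract: one vector per block
    blocks = np.tanh(blocks @ w)                         # parameterized projection
    expanded = np.repeat(blocks, r, axis=0)              # expand back to padded length
    return expanded[:length]

def multi_range_gate(x, ranges, weights):
    """Combine several ranges into one gate in (0, 1) and apply it to x (assumed form)."""
    summed = sum(contract_expand(x, r, w) for r, w in zip(ranges, weights))
    gate = 1.0 / (1.0 + np.exp(-summed))                 # sigmoid gating vector
    return gate * x

rng = np.random.default_rng(0)
x = rng.normal(size=(7, 4))                              # 7 tokens, 4 dimensions
ranges = [1, 2, 4]                                       # hypothetical range values
weights = [rng.normal(size=(4, 4)) for _ in ranges]
encoded = multi_range_gate(x, ranges, weights)
print(encoded.shape)
```

Because every step is a reshape, a matrix product, or a repeat, nothing here iterates over time steps, which is the property the abstract associates with fast sequence encoding.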
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
Methods: Convolution
