Subword-augmented Embedding for Cloze Reading Comprehension
Zhuosheng Zhang, Yafang Huang, Hai Zhao

TL;DR
This paper introduces a subword-augmented embedding method for cloze reading comprehension, improving word representations by using subword units to handle rare words and enhance model performance.
Contribution
It proposes a novel subword-based embedding augmentation strategy and empirically demonstrates its effectiveness over character-based methods in reading comprehension tasks.
Findings
Significant performance improvement over baselines on multiple datasets
Effective handling of rare words with subword augmentation
Enhanced generalization ability of the reading model
Abstract
Representation learning is the foundation of machine reading comprehension. In state-of-the-art models, deep learning methods broadly use word- and character-level representations. However, the character is not naturally the minimal linguistic unit. In addition, by simply concatenating character and word embeddings, previous models actually give a suboptimal solution. In this paper, we propose to use subwords rather than characters for word embedding enhancement. We also empirically explore different augmentation strategies on subword-augmented embedding to enhance the cloze-style reading comprehension model (the reader). In detail, we present a reader that uses subword-level representation to augment word embedding, with a short list to handle rare words effectively. A thorough examination is conducted to evaluate the comprehensive performance and generalization ability of the proposed reader.…
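The idea of augmenting word embeddings with subword representations under a short list can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the greedy longest-match segmenter stands in for a learned subword segmentation, averaging the subword vectors stands in for a learned subword encoder, and the three combination modes (`concat`, `sum`, `mul`) are just plausible examples of "augmentation strategies". All function names are hypothetical.

```python
from collections import Counter

def build_short_list(tokens, size):
    """Keep the `size` most frequent words; all other words count as rare."""
    return {w for w, _ in Counter(tokens).most_common(size)}

def segment(word, subword_vocab):
    """Greedy longest-match segmentation (a stand-in for a learned segmenter).
    A character not covered by the vocabulary becomes its own piece."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in subword_vocab or j == i + 1:
                pieces.append(word[i:j])
                i = j
                break
    return pieces

def augment(word_vec, subword_vec, strategy="concat"):
    """Combine word-level and subword-level vectors (illustrative strategies)."""
    if strategy == "concat":
        return word_vec + subword_vec
    if strategy == "sum":
        return [a + b for a, b in zip(word_vec, subword_vec)]
    if strategy == "mul":
        return [a * b for a, b in zip(word_vec, subword_vec)]
    raise ValueError(f"unknown strategy: {strategy}")

def embed(word, short_list, word_table, subword_table, subword_vocab,
          strategy="concat"):
    """Rare words (outside the short list) share the <unk> word vector,
    so their identity is carried by the subword part alone."""
    word_vec = word_table[word if word in short_list else "<unk>"]
    dim = len(word_vec)
    pieces = segment(word, subword_vocab)
    # Assumption: average the piece vectors; unseen pieces map to zeros.
    vecs = [subword_table.get(p, [0.0] * dim) for p in pieces]
    subword_vec = [sum(col) / len(vecs) for col in zip(*vecs)]
    return augment(word_vec, subword_vec, strategy)
```

In this sketch a rare word such as "unhappiness" still receives a distinctive vector through its pieces even though its word-level lookup falls back to `<unk>`, which is the intuition behind pairing a short list with subword augmentation.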
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
