A Comparative Study of Word Embeddings for Reading Comprehension
Bhuwan Dhingra, Hanxiao Liu, Ruslan Salakhutdinov, William W. Cohen

TL;DR
This paper shows that the choice of pre-trained word embeddings and the representation of out-of-vocabulary tokens at test time can affect reading comprehension performance more than model architecture choices do.
Contribution
The paper systematically evaluates several options for pre-trained embeddings and out-of-vocabulary handling, and gives practical recommendations to researchers working on reading comprehension.
Findings
The choice of pre-trained word embeddings strongly affects final performance.
How out-of-vocabulary tokens are represented at test time also matters.
These seemingly minor choices can have a larger impact than architectural choices.
Abstract
The focus of past machine learning research for Reading Comprehension tasks has been primarily on the design of novel deep learning architectures. Here we show that seemingly minor choices made on (1) the use of pre-trained word embeddings, and (2) the representation of out-of-vocabulary tokens at test time, can turn out to have a larger impact than architectural choices on the final performance. We systematically explore several options for these choices, and provide recommendations to researchers working in this area.
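To make the two choices in the abstract concrete, the sketch below shows one way they might appear in code. It is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the `strategy` values, the toy vectors, and the random initialization scale are all hypothetical. It contrasts two plausible options for a token unseen during training: map it to a single shared unknown vector, or use its pre-trained vector when one exists and a fresh random vector otherwise.

```python
import numpy as np


def build_embedding_table(train_vocab, pretrained, dim, seed=0):
    """Map each training token to a vector: pre-trained if available, else random."""
    rng = np.random.default_rng(seed)
    table = {}
    for token in train_vocab:
        if token in pretrained:
            table[token] = np.asarray(pretrained[token], dtype=np.float32)
        else:
            # Assumed initialization scale; tune for a real model.
            table[token] = rng.normal(0.0, 0.1, size=dim).astype(np.float32)
    return table


def embed_test_token(token, table, pretrained, dim, unk_vector, rng,
                     strategy="pretrained_or_random"):
    """Return a vector for a test-time token under a hypothetical OOV strategy."""
    if token in table:
        return table[token]
    if strategy == "unk":
        # Option A: every unseen token shares one unknown vector.
        return unk_vector
    if strategy != "pretrained_or_random":
        raise ValueError(f"unknown strategy: {strategy}")
    # Option B: fall back to the pre-trained vector, else a fresh random vector.
    if token in pretrained:
        vector = np.asarray(pretrained[token], dtype=np.float32)
    else:
        vector = rng.normal(0.0, 0.1, size=dim).astype(np.float32)
    # Cache so repeated occurrences of the same token share one vector.
    table[token] = vector
    return vector


# Toy usage with made-up two-dimensional vectors.
pretrained = {"paris": [0.2, 0.7], "france": [0.3, 0.6]}
table = build_embedding_table(["paris", "capital"], pretrained, dim=2)
rng = np.random.default_rng(1)
unk = np.zeros(2, dtype=np.float32)
vec = embed_test_token("france", table, pretrained, 2, unk, rng)
```

Under option A, distinct unseen tokens become indistinguishable to the model; under option B they stay distinct, which is one reason test-time OOV handling could plausibly change results for tasks where the answer is a rare entity.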
Taxonomy
Topics: Topic Modeling · Text Readability and Simplification · Natural Language Processing Techniques
