TL;DR
This paper demonstrates that incorporating rich contextualized word representations from a large pre-trained language model significantly improves reading comprehension performance, achieving results comparable to state-of-the-art methods on SQuAD.
Contribution
It introduces the use of large pre-trained language models for contextualized word representations in reading comprehension tasks, highlighting the importance of context even when question and document are processed independently.
Findings
Large performance improvements on the SQuAD dataset.
Contextualized representations outperform traditional embeddings.
Model reaches performance comparable to the state of the art on SQuAD while processing the question and document independently.
Abstract
Reading a document and extracting an answer to a question about its content has attracted substantial attention recently. While most work has focused on the interaction between the question and the document, in this work we evaluate the importance of context when the question and document are processed independently. We take a standard neural architecture for this task, and show that by providing rich contextualized word representations from a large pre-trained language model as well as allowing the model to choose between context-dependent and context-independent word representations, we can obtain dramatic improvements and reach performance comparable to state-of-the-art on the competitive SQuAD dataset.
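The abstract mentions letting the model choose between context-dependent and context-independent word representations but does not say how that choice is made. One common way to realize such a choice is a learned sigmoid gate that interpolates between the two vectors for each token. The sketch below is an illustration under that assumption, not the paper's implementation; the function name `gated_mix` and the parameters `weights` and `bias` are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_mix(contextual: Vector, static: Vector,
              weights: List[Vector], bias: Vector) -> Vector:
    """Blend one token's contextual and static vectors with an elementwise gate.

    Assumed (hypothetical) parameterization:
      contextual, static: length-dim vectors for the same token
      weights: dim rows, each of length 2 * dim
      bias: length-dim vector
    """
    # The gate sees both representations, concatenated into one feature vector.
    features = contextual + static
    mixed = []
    for i, (c, s) in enumerate(zip(contextual, static)):
        pre_activation = sum(w * f for w, f in zip(weights[i], features)) + bias[i]
        gate = sigmoid(pre_activation)
        # gate near 1 keeps the contextual value, gate near 0 keeps the static one.
        mixed.append(gate * c + (1.0 - gate) * s)
    return mixed


# Toy inputs: with all-zero parameters every gate is 0.5, so the output is the
# plain average of the two representations.
contextual = [1.0, -2.0]
static = [3.0, 4.0]
zero_weights = [[0.0] * 4 for _ in range(2)]
zero_bias = [0.0, 0.0]
averaged = gated_mix(contextual, static, zero_weights, zero_bias)
```

In a trained model the gate parameters would be learned jointly with the rest of the network, so each dimension of each token can lean toward whichever representation is more useful.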
