REALM: Retrieval-Augmented Language Model Pre-Training

Kelvin Guu; Kenton Lee; Zora Tung; Panupong Pasupat; Ming-Wei Chang

arXiv:2002.08909 · cs.CL · February 21, 2020 · 515 cites

TL;DR

REALM introduces a retrieval-augmented pre-training method for language models that enhances knowledge access, interpretability, and modularity, significantly improving open-domain question answering performance.

Contribution

The paper presents a novel unsupervised pre-training approach for a knowledge retriever integrated with a language model, so that the model can retrieve documents during pre-training, fine-tuning, and inference.

Findings

01 Outperforms previous models on open-domain QA benchmarks by 4-16% absolute accuracy

02 First to pre-train a knowledge retriever in an unsupervised manner, using masked language modeling as the learning signal

03 Provides qualitative benefits in interpretability and modularity

Abstract

Language model pre-training has been shown to capture a surprising amount of world knowledge, crucial for NLP tasks such as question answering. However, this knowledge is stored implicitly in the parameters of a neural network, requiring ever-larger networks to cover more facts. To capture knowledge in a more modular and interpretable way, we augment language model pre-training with a latent knowledge retriever, which allows the model to retrieve and attend over documents from a large corpus such as Wikipedia, used during pre-training, fine-tuning and inference. For the first time, we show how to pre-train such a knowledge retriever in an unsupervised manner, using masked language modeling as the learning signal and backpropagating through a retrieval step that considers millions of documents. We demonstrate the effectiveness of Retrieval-Augmented Language Model pre-training…
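The retrieve-then-predict idea in the abstract can be sketched as a marginal over retrieved documents: the model scores documents against the input, turns the scores into a distribution, and averages the per-document prediction probabilities under that distribution. The snippet below is a minimal NumPy illustration, not the authors' implementation. It assumes an inner-product relevance score and a top-k approximation, neither of which is spelled out in the abstract above, and every name in it (`softmax`, `marginal_likelihood`, `query_vec`, `doc_vecs`, `p_y_given_doc`) is hypothetical.

```python
import numpy as np


def softmax(scores):
    """Numerically stable softmax over a 1-D array of scores."""
    shifted = scores - np.max(scores)
    exp = np.exp(shifted)
    return exp / exp.sum()


def marginal_likelihood(query_vec, doc_vecs, p_y_given_doc, k=2):
    """Approximate p(y | x) = sum_z p(y | z, x) * p(z | x) over the top-k documents.

    query_vec:      embedding of the input x, shape (d,)        (assumed given)
    doc_vecs:       embeddings of all documents z, shape (n, d) (assumed given)
    p_y_given_doc:  p(y | z, x) for every document, shape (n,)  (assumed given)
    """
    # Relevance score of each document: inner product with the query (assumption).
    scores = doc_vecs @ query_vec
    # Keep only the k highest-scoring documents, best first.
    top = np.argsort(scores)[::-1][:k]
    # Retrieval distribution p(z | x), renormalised over the retrieved documents.
    p_z = softmax(scores[top])
    # Marginalise the prediction over the retrieved documents.
    marginal = float(np.sum(p_z * p_y_given_doc[top]))
    return marginal, top, p_z


# Toy example with three 2-dimensional document embeddings.
query = np.array([1.0, 0.0])
docs = np.array([[2.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
p_y = np.array([0.9, 0.1, 0.3])
marginal, top, p_z = marginal_likelihood(query, docs, p_y, k=2)
print(marginal, top, p_z)
```

Because the marginal is a smooth function of the relevance scores, a gradient of the prediction loss can flow back into the document and query embeddings, which is one way to read "backpropagating through a retrieval step" in the abstract.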

Code & Models

Videos

REALM: Retrieval-Augmented Language Model Pre-Training (Paper Explained) · YouTube

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis

Methods: Interpretability