The Effectiveness of Masked Language Modeling and Adapters for Factual Knowledge Injection
Sondre Wold

TL;DR
This paper injects factual knowledge into large pre-trained language models by training adapter modules on parts of ConceptNet, improving performance on subsets of the LAMA probe while adding as little as 2.1% additional parameters.
Contribution
The paper trains adapter modules on parts of the ConceptNet knowledge graph with the masked language modeling objective to inject factual knowledge into pre-trained language models, and evaluates the result with probing experiments on LAMA.
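To make the training signal concrete, here is a minimal sketch of how a knowledge-graph triple could be turned into a masked language modeling example. The templates, the function name `triple_to_masked_example`, and the choice to mask the object are all illustrative assumptions; this summary does not say how the paper verbalizes or masks ConceptNet triples.

```python
# Hypothetical relation templates, invented for illustration only.
# They are not taken from the paper or from ConceptNet itself.
TEMPLATES = {
    "IsA": "{subj} is a {obj}.",
    "UsedFor": "{subj} is used for {obj}.",
    "AtLocation": "{subj} is found at {obj}.",
}


def triple_to_masked_example(subj, relation, obj, mask_token="[MASK]"):
    """Verbalize a (subject, relation, object) triple and mask the object.

    Returns a (masked_text, label) pair. Assumes the object fits in a
    single mask position; multi-token objects would need one mask per token.
    """
    template = TEMPLATES[relation]
    masked_text = template.format(subj=subj, obj=mask_token)
    return masked_text, obj


masked, label = triple_to_masked_example("violin", "IsA", "instrument")
print(masked, "->", label)
```

In a setup like this, the model is asked to recover `label` at the masked position, and only the adapter weights would be updated while the base model stays frozen.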
Findings
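Adapter modules improve performance on subsets of the LAMA probe, as measured by mean P@K curves.
The gains appear for large values of k and require as little as 2.1% additional parameters.
The results indicate that the technique is effective for factual knowledge injection.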
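A parameter overhead of a few percent is typical of bottleneck adapters, and the arithmetic can be sketched as follows. The sketch assumes a standard down-projection/up-projection adapter with biases; the hidden size, bottleneck size, layer count, adapters per layer, and base parameter count are illustrative values, not the configuration reported in the paper.

```python
def adapter_params(hidden_size, bottleneck_size):
    """Parameters in one bottleneck adapter: down- and up-projection with biases."""
    down = hidden_size * bottleneck_size + bottleneck_size
    up = bottleneck_size * hidden_size + hidden_size
    return down + up


def overhead(base_params, hidden_size, bottleneck_size, num_layers, adapters_per_layer):
    """Fraction of parameters added to the base model by all adapters."""
    added = adapter_params(hidden_size, bottleneck_size) * num_layers * adapters_per_layer
    return added / base_params


# Illustrative values only (assumptions, not the paper's setup):
# a 12-layer encoder with hidden size 768 and roughly 110M base parameters.
ratio = overhead(
    base_params=110_000_000,
    hidden_size=768,
    bottleneck_size=64,
    num_layers=12,
    adapters_per_layer=2,
)
print(f"added parameters: {ratio:.2%} of the base model")
```

Because the added parameter count grows linearly with the bottleneck size, the overhead can be tuned by changing that one value.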
Abstract
This paper studies the problem of injecting factual knowledge into large pre-trained language models. We train adapter modules on parts of the ConceptNet knowledge graph using the masked language modeling objective and evaluate the success of the method by a series of probing experiments on the LAMA probe. Mean P@K curves for different configurations indicate that the technique is effective, increasing the performance on subsets of the LAMA probe for large values of k by adding as little as 2.1% additional parameters to the original models.
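The evaluation metric can be illustrated with a small sketch. It assumes the usual probing definition of P@k, where an example scores 1 if the gold answer appears among the model's top k predictions and 0 otherwise, averaged over examples. The function names and the toy data are hypothetical and carry no results from the paper.

```python
def precision_at_k(ranked_predictions, gold, k):
    """Return 1.0 if the gold answer is among the top-k predictions, else 0.0."""
    return 1.0 if gold in ranked_predictions[:k] else 0.0


def mean_precision_at_k(examples, k):
    """Average P@k over (ranked_predictions, gold) pairs."""
    if not examples:
        raise ValueError("examples must not be empty")
    return sum(precision_at_k(ranked, gold, k) for ranked, gold in examples) / len(examples)


def p_at_k_curve(examples, ks):
    """Mean P@k for each k in ks, giving the points of a P@k curve."""
    return {k: mean_precision_at_k(examples, k) for k in ks}


# Toy data for illustration only: each entry is (ranked predictions, gold answer).
examples = [
    (["paris", "london", "rome"], "paris"),
    (["dog", "cat", "bird"], "cat"),
    (["red", "blue", "green"], "yellow"),
]
print(p_at_k_curve(examples, ks=[1, 2, 3]))
```

Mean P@k can only stay flat or rise as k grows, so comparing whole curves between a base model and an adapter-augmented model shows at which values of k a difference appears.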
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning and Data Classification
Methods: Softmax · Tanh Activation · Adapter · Low-Rank Factorization-based Multi-Head Attention
