Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers

Anne Lauscher; Olga Majewska; Leonardo F. R. Ribeiro; Iryna Gurevych; Nikolai Rozanov; Goran Glavaš

arXiv:2005.11787·cs.CL·October 13, 2020

TL;DR

This paper explores adapter-based methods to inject external conceptual knowledge into BERT, aiming to enhance its reasoning capabilities without catastrophic forgetting, with mixed results on standard benchmarks.

Contribution

It introduces an adapter-based approach for integrating ConceptNet and OMCS knowledge into BERT, demonstrating significant improvements on inference tasks requiring conceptual understanding.

Findings

01. Adapter models outperform BERT by up to 15-20 points on inference tasks that require the kind of conceptual knowledge present in ConceptNet and OMCS.

02. Overall results on GLUE are inconclusive; the benefits are task-specific.

03. Code and experiments are released as open source.

Abstract

Following the major success of neural language models (LMs) such as BERT or GPT-2 on a variety of language understanding tasks, recent work focused on injecting (structured) knowledge from external resources into these models. While on the one hand, joint pretraining (i.e., training from scratch, adding objectives based on external knowledge to the primary LM objective) may be prohibitively computationally expensive, post-hoc fine-tuning on external knowledge, on the other hand, may lead to the catastrophic forgetting of distributional knowledge. In this work, we investigate models for complementing the distributional knowledge of BERT with conceptual knowledge from ConceptNet and its corresponding Open Mind Common Sense (OMCS) corpus, respectively, using adapter training. While overall results on the GLUE benchmark paint an inconclusive picture, a deeper analysis reveals that our…
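The adapter training mentioned in the abstract can be illustrated with a minimal sketch of a bottleneck adapter: a small down-projection, a nonlinearity, an up-projection, and a residual connection, inserted into an otherwise unchanged Transformer layer. This is not the authors' implementation. The class name `BottleneckAdapter`, the sizes, the GELU activation, and the zero initialization of the up-projection are all assumptions chosen for illustration, and the code uses only the Python standard library.

```python
import math
import random


def gelu(x):
    # tanh approximation of the GELU activation
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))


def affine(weight, bias, vec):
    # weight: list of rows, so the result has len(weight) entries
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weight, bias)]


class BottleneckAdapter:
    """Hypothetical bottleneck adapter: h + W_up * gelu(W_down * h + b_down) + b_up."""

    def __init__(self, hidden_size, bottleneck_size, seed=0):
        rng = random.Random(seed)
        # Down-projection: hidden_size -> bottleneck_size, small random init.
        self.w_down = [[rng.gauss(0.0, 0.01) for _ in range(hidden_size)]
                       for _ in range(bottleneck_size)]
        self.b_down = [0.0] * bottleneck_size
        # Up-projection: bottleneck_size -> hidden_size. Zero init (an
        # illustrative choice) makes the adapter an identity map at the start,
        # so the pretrained model's behaviour is initially unchanged.
        self.w_up = [[0.0] * bottleneck_size for _ in range(hidden_size)]
        self.b_up = [0.0] * hidden_size

    def __call__(self, hidden):
        z = [gelu(a) for a in affine(self.w_down, self.b_down, hidden)]
        delta = affine(self.w_up, self.b_up, z)
        # Residual connection: the adapter only adds a correction to its input.
        return [h + d for h, d in zip(hidden, delta)]

    def num_parameters(self):
        rows_down, cols_down = len(self.w_down), len(self.w_down[0])
        rows_up, cols_up = len(self.w_up), len(self.w_up[0])
        return rows_down * cols_down + rows_down + rows_up * cols_up + rows_up


adapter = BottleneckAdapter(hidden_size=8, bottleneck_size=2)
hidden_state = [0.5] * 8
print(adapter(hidden_state) == hidden_state)  # identity before any training
print(adapter.num_parameters())
```

In a setup like the one the abstract describes, only parameters of this kind would be updated on the knowledge source (ConceptNet or the OMCS corpus), while the pretrained BERT weights stay fixed, as in the general adapter recipe. Keeping the pretrained weights fixed is what is meant to avoid catastrophic forgetting of distributional knowledge.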

Code & Models

Repositories

wluper/retrograph (TensorFlow, official)