K-BERT: Enabling Language Representation with Knowledge Graph

Weijie Liu; Peng Zhou; Zhe Zhao; Zhiruo Wang; Qi Ju; Haotang Deng; Ping Wang

arXiv:1909.07606 · cs.CL · September 18, 2019 · 84 cites

TL;DR

K-BERT enhances language understanding by integrating knowledge graphs into BERT, improving performance on domain-specific NLP tasks through controlled knowledge injection.

Contribution

The paper introduces K-BERT, a novel model that incorporates knowledge graphs into BERT with mechanisms to mitigate knowledge noise, enabling effective domain knowledge integration without additional pre-training.

Findings

1. K-BERT outperforms BERT on twelve NLP tasks.

2. Significant improvements in finance, law, and medicine domains.

3. Effective knowledge injection without pre-training.

Abstract

Pre-trained language representation models, such as BERT, capture a general language representation from large-scale corpora, but lack domain-specific knowledge. When reading a domain text, experts make inferences with relevant knowledge. For machines to achieve this capability, we propose a knowledge-enabled language representation model (K-BERT) with knowledge graphs (KGs), in which triples are injected into the sentences as domain knowledge. However, incorporating too much knowledge may divert the sentence from its correct meaning, which is called the knowledge noise (KN) issue. To overcome KN, K-BERT introduces soft-position and a visible matrix to limit the impact of knowledge. K-BERT can easily inject domain knowledge into the models by being equipped with a KG, without pre-training by itself, because it is capable of loading model parameters from the pre-trained BERT. Our investigation reveals…
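
The abstract names the two mechanisms (soft-position and a visible matrix) without describing them, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes word-level tokens, a toy knowledge graph keyed by entity, that injected tokens continue the position count of the entity they attach to, and that an injected token is visible only to its own triple and its entity. The names `inject_triples`, `visible_matrix`, and the toy graph are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical toy knowledge graph: entity -> list of (relation, object) pairs.
ToyKG = Dict[str, List[Tuple[str, str]]]


def inject_triples(tokens: List[str], kg: ToyKG):
    """Insert each matching triple right after its entity token.

    Returns four parallel lists:
      flat     - the sentence with injected tokens, in reading order
      soft_pos - soft-position index of each token (assumed scheme:
                 injected tokens continue counting from their entity)
      branch   - -1 for original tokens, else an id unique to the triple
      anchor   - -1 for original tokens, else the flat index of the entity
    """
    flat, soft_pos, branch, anchor = [], [], [], []
    next_branch = 0
    for pos, tok in enumerate(tokens):
        flat.append(tok)
        soft_pos.append(pos)
        branch.append(-1)
        anchor.append(-1)
        entity_idx = len(flat) - 1
        for relation, obj in kg.get(tok, []):
            for offset, extra in enumerate((relation, obj), start=1):
                flat.append(extra)
                soft_pos.append(pos + offset)
                branch.append(next_branch)
                anchor.append(entity_idx)
            next_branch += 1
    return flat, soft_pos, branch, anchor


def visible_matrix(branch: List[int], anchor: List[int]) -> List[List[float]]:
    """Additive attention mask: 0.0 where i may attend to j, -inf otherwise."""

    def visible(i: int, j: int) -> bool:
        if branch[i] == -1 and branch[j] == -1:
            return True  # both tokens belong to the original sentence
        if branch[i] == branch[j]:
            return True  # both tokens belong to the same injected triple
        if branch[i] == -1:
            return anchor[j] == i  # i is the entity that j hangs off
        if branch[j] == -1:
            return anchor[i] == j  # j is the entity that i hangs off
        return False  # tokens from two different injected triples

    n = len(branch)
    return [[0.0 if visible(i, j) else float("-inf") for j in range(n)]
            for i in range(n)]


tokens = ["Tim", "Cook", "is", "visiting", "Beijing", "now"]
kg: ToyKG = {
    "Cook": [("CEO", "Apple")],
    "Beijing": [("capital", "China"), ("is_a", "City")],
}
flat, soft_pos, branch, anchor = inject_triples(tokens, kg)
mask = visible_matrix(branch, anchor)
print(list(zip(flat, soft_pos)))
```

In this sketch "Apple" can attend to "Cook" and "CEO" but not to "Beijing", so an injected fact influences only the entity it describes. The mask would be added to the attention scores before the softmax; how K-BERT wires this into the encoder is not stated in the abstract.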

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Linear Layer · Weight Decay · Residual Connection · Adam · Layer Normalization · Softmax · Attention Is All You Need · Dropout · Multi-Head Attention