CoLAKE: Contextualized Language and Knowledge Embedding
Tianxiang Sun, Yunfan Shao, Xipeng Qiu, Qipeng Guo, Yaru Hu, Xuanjing Huang, Zheng Zhang

TL;DR
CoLAKE introduces a unified framework for jointly learning contextualized language and knowledge embeddings using a word-knowledge graph, significantly improving performance on various knowledge-driven and language understanding tasks.
Contribution
It proposes a novel method to integrate language and knowledge representations in a unified model with a new data structure, enhancing contextualization and task performance.
Findings
Outperforms previous models on multiple tasks
Achieves high accuracy on word-knowledge graph completion
Demonstrates the effectiveness of joint contextualized embeddings
Abstract
With the emerging branch of incorporating factual knowledge into pre-trained language models such as BERT, most existing models consider shallow, static, and separately pre-trained entity embeddings, which limits the performance gains of these models. Few works explore the potential of deep contextualized knowledge representation when injecting knowledge. In this paper, we propose the Contextualized Language and Knowledge Embedding (CoLAKE), which jointly learns contextualized representation for both language and knowledge with the extended MLM objective. Instead of injecting only entity embeddings, CoLAKE extracts the knowledge context of an entity from large-scale knowledge bases. To handle the heterogeneity of knowledge context and language context, we integrate them in a unified data structure, word-knowledge graph (WK graph). CoLAKE is pre-trained on large-scale WK graphs with the…
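As a rough illustration of the word-knowledge graph idea named in the abstract, the Python sketch below merges a tokenized sentence with a few knowledge-base triples into a single graph of typed nodes. It is a toy under stated assumptions, not the authors' implementation: the `WKGraph` class, the `build_wk_graph` helper, the entity label, the relation name, and the choice to fully connect sentence nodes are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class WKGraph:
    """Toy word-knowledge graph: typed nodes plus undirected edges (hypothetical)."""
    nodes: list = field(default_factory=list)  # (label, node_type) pairs
    edges: set = field(default_factory=set)    # frozenset({i, j}) of node indices

    def add_node(self, label, node_type):
        self.nodes.append((label, node_type))
        return len(self.nodes) - 1

    def add_edge(self, i, j):
        self.edges.add(frozenset((i, j)))


def build_wk_graph(tokens, mentions, triples):
    """tokens: list of str; mentions: {token_index: entity_label};
    triples: list of (head_entity, relation, tail_entity)."""
    graph = WKGraph()
    # Language context: one node per token; linked mentions become entity nodes.
    ids = []
    for pos, tok in enumerate(tokens):
        if pos in mentions:
            ids.append(graph.add_node(mentions[pos], "entity"))
        else:
            ids.append(graph.add_node(tok, "word"))
    # Assumption: sentence-level nodes are fully connected, as in self-attention.
    for a in range(len(ids)):
        for b in range(a + 1, len(ids)):
            graph.add_edge(ids[a], ids[b])
    # Knowledge context: hang relation and tail nodes off each linked entity.
    anchor = {ent: ids[pos] for pos, ent in mentions.items()}
    for head, rel, tail in triples:
        if head in anchor:
            r = graph.add_node(rel, "relation")
            t = graph.add_node(tail, "entity")
            graph.add_edge(anchor[head], r)
            graph.add_edge(r, t)
    return graph


# Made-up example data; labels are placeholders, not real knowledge-base IDs.
graph = build_wk_graph(
    tokens=["Harry_Potter", "points", "his", "wand"],
    mentions={0: "ent:Harry_Potter"},
    triples=[("ent:Harry_Potter", "enemy_of", "ent:Voldemort")],
)
for index, (label, node_type) in enumerate(graph.nodes):
    print(index, node_type, label)
```

Keeping words, entities, and relations as nodes of one graph means a single encoder could, in principle, attend over both kinds of context at once, which is the heterogeneity problem the abstract describes.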
Taxonomy
Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Dense Connections · Dropout · Linear Warmup With Linear Decay · Layer Normalization · Attention Dropout · Byte Pair Encoding
