Deep Bidirectional Language-Knowledge Graph Pretraining
Michihiro Yasunaga, Antoine Bosselut, Hongyu Ren, Xikun Zhang, Christopher D Manning, Percy Liang, Jure Leskovec

TL;DR
DRAGON is a self-supervised pretraining method that deeply fuses text and knowledge graph data to improve reasoning and question answering across multiple domains, achieving state-of-the-art results.
Contribution
DRAGON introduces a bidirectional pretraining approach that learns jointly from text and knowledge graphs at scale, enabling better reasoning capabilities.
Findings
+5% average gain on downstream tasks
+10% on complex reasoning questions
State-of-the-art results on BioNLP tasks
Abstract
Pretraining a language model (LM) on text has been shown to help various downstream NLP tasks. Recent works show that a knowledge graph (KG) can complement text data, offering structured background knowledge that provides a useful scaffold for reasoning. However, these works are not pretrained to learn a deep fusion of the two modalities at scale, limiting the potential to acquire fully joint representations of text and KG. Here we propose DRAGON (Deep Bidirectional Language-Knowledge Graph Pretraining), a self-supervised approach to pretraining a deeply joint language-knowledge foundation model from text and KG at scale. Specifically, our model takes pairs of text segments and relevant KG subgraphs as input and bidirectionally fuses information from both modalities. We pretrain this model by unifying two self-supervised reasoning tasks, masked language modeling and KG link prediction.…
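The abstract describes a pretraining objective that unifies masked language modeling with KG link prediction. A minimal sketch of how two such losses might be combined is shown below. It is illustrative only and is not the authors' implementation: the DistMult scoring function, the negative-sampling form of the link prediction loss, the unweighted sum, and all function names (`mlm_loss`, `link_prediction_loss`, `joint_loss`) are assumptions made for this example.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def mlm_loss(token_logits, target_ids, masked_positions):
    """Mean cross-entropy over the masked token positions only."""
    losses = [
        -log_softmax(token_logits[i])[target_ids[i]] for i in masked_positions
    ]
    return sum(losses) / len(losses)


def distmult_score(head, rel, tail):
    """DistMult triple score: sum_k head[k] * rel[k] * tail[k].

    Assumption: DistMult is used here as one common KG scoring function.
    """
    return sum(h * r * t for h, r, t in zip(head, rel, tail))


def link_prediction_loss(positive, negatives):
    """Negative-sampling loss for one held-out (head, rel, tail) triple.

    `positive` is a tuple of three embedding vectors; `negatives` is a
    list of corrupted triples in the same format (hypothetical inputs).
    """
    loss = -log_sigmoid(distmult_score(*positive))
    neg_terms = [-log_sigmoid(-distmult_score(*neg)) for neg in negatives]
    return loss + sum(neg_terms) / len(neg_terms)


def joint_loss(token_logits, target_ids, masked_positions, positive, negatives):
    """Unweighted sum of the two self-supervised losses (an assumption)."""
    return mlm_loss(token_logits, target_ids, masked_positions) + \
        link_prediction_loss(positive, negatives)
```

In this sketch the token logits and the entity embeddings would both come from the fused text-KG encoder, so gradients from either loss would update the shared representations.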
Taxonomy
Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Natural Language Processing Techniques
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Label Smoothing · Dropout · Byte Pair Encoding · Adam · Dense Connections · Softmax
