OAG-BERT: Towards A Unified Backbone Language Model For Academic Knowledge Services
Xiao Liu, Da Yin, Jingnan Zheng, Xingjian Zhang, Peng Zhang, Hongxia Yang, Yuxiao Dong, Jie Tang

TL;DR
OAG-BERT is a unified academic language model that integrates entity knowledge and scientific texts, enabling zero-shot inference to reduce annotation costs and support various academic applications.
Contribution
The paper introduces OAG-BERT, a novel pre-trained model combining heterogeneous academic knowledge and texts, with strategies for zero-shot inference to enhance academic knowledge services.
Findings
OAG-BERT effectively supports reviewer recommendation and paper tagging.
Zero-shot inference reduces the need for expensive annotations.
The model is deployed in real-world academic applications.
Abstract
Academic knowledge services have substantially facilitated the development of the science enterprise by providing a plenitude of efficient research tools. However, many applications highly depend on ad-hoc models and expensive human labeling to understand scientific contents, hindering deployments into real products. To build a unified backbone language model for different knowledge-intensive academic applications, we pre-train an academic language model OAG-BERT that integrates both the heterogeneous entity knowledge and scientific corpora in the Open Academic Graph (OAG) -- the largest public academic graph to date. In OAG-BERT, we develop strategies for pre-training text and entity data along with zero-shot inference techniques. Its zero-shot capability furthers the…
Taxonomy
Topics: Topic Modeling · Advanced Graph Neural Networks · Data Quality and Management
