KGBoost: A Classification-based Knowledge Base Completion Method with Negative Sampling

Yun-Cheng Wang; Xiou Ge; Bin Wang; C.-C. Jay Kuo

arXiv:2112.09340·cs.LG·April 8, 2022

TL;DR

KGBoost reformulates knowledge base completion as binary classification, training one XGBoost classifier per relation on hard negative samples to predict missing links. It remains effective in low-dimensional settings.

Contribution

It introduces a modular, classification-based approach to knowledge base completion that uses hard negative sampling and outperforms state-of-the-art methods on most benchmark datasets.

Findings

01. Outperforms state-of-the-art methods on most of the benchmark datasets tested

02. Works well in low-dimensional settings, which allows a smaller model size

03. Demonstrates the benefit of a modular, classification-based design

Abstract

Knowledge base completion is formulated as a binary classification problem in this work, where an XGBoost binary classifier is trained for each relation using relevant links in knowledge graphs (KGs). The new method, named KGBoost, adopts a modularized design and attempts to find hard negative samples so as to train a powerful classifier for missing link prediction. We conduct experiments on multiple benchmark datasets, and demonstrate that KGBoost outperforms state-of-the-art methods across most datasets. Furthermore, as compared with models trained by end-to-end optimization, KGBoost works well under the low-dimensional setting so as to allow a smaller model size.
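
As a rough illustration of the per-relation formulation described in the abstract, the sketch below groups triples by relation and attaches negative samples, producing one labeled dataset per relation on which a binary classifier (XGBoost in the paper) could be trained. It is a minimal sketch under stated assumptions, not the authors' pipeline: negatives are drawn by uniformly corrupting the tail entity, which is a common baseline and not the hard negative strategy the paper proposes (the abstract does not describe that strategy). The function name `build_relation_datasets`, the `num_negatives` parameter, and the toy entities and relations are all hypothetical.

```python
import random
from collections import defaultdict


def build_relation_datasets(triples, num_negatives=2, seed=0):
    """Group (head, relation, tail) triples by relation and add negatives.

    Returns {relation: [(head, tail, label), ...]}, where label is 1 for an
    observed link and 0 for a sampled negative.

    Assumption: negatives come from uniform tail corruption. This is a
    simple baseline, NOT the hard negative sampling used by KGBoost.
    """
    rng = random.Random(seed)
    known = set(triples)
    entities = sorted({e for h, _, t in triples for e in (h, t)})
    datasets = defaultdict(list)

    for h, r, t in triples:
        datasets[r].append((h, t, 1))
        # Keep only corrupted tails that do not form an observed triple,
        # so that no known positive is mislabeled as a negative.
        candidates = [e for e in entities if e != h and (h, r, e) not in known]
        k = min(num_negatives, len(candidates))
        for neg_t in rng.sample(candidates, k):
            datasets[r].append((h, neg_t, 0))

    return dict(datasets)


# Hypothetical toy knowledge graph.
triples = [
    ("alice", "works_at", "acme"),
    ("bob", "works_at", "globex"),
    ("acme", "located_in", "paris"),
    ("globex", "located_in", "berlin"),
]
datasets = build_relation_datasets(triples)
for relation, rows in datasets.items():
    print(relation, rows)
```

Each per-relation list can then be turned into feature vectors (for example, from entity embeddings) and passed to a separate binary classifier. Keeping the datasets separate per relation is what makes the design modular: a classifier for one relation can be retrained without touching the others.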

Taxonomy

Methods: Balanced Selection