Scaling Knowledge Graph Embedding Models

Nasrullah Sheikh; Xiao Qin; Berthold Reinwald; Chuan Lei

arXiv:2201.02791 · cs.LG · January 11, 2022

TL;DR

This paper introduces a scalable training method for GNN-based knowledge graph embedding models that achieves a 16x speed-up on benchmark datasets while keeping link prediction performance comparable to non-distributed training.

Contribution

The paper combines three algorithmic strategies to scale GNN-based knowledge graph embedding models: self-sufficient partitions, constraint-based negative sampling, and edge mini-batch training.

Findings

1. Achieves a 16x speed-up on benchmark datasets.
2. Maintains model performance comparable to non-distributed methods on standard metrics.
3. Avoids cross-partition data transfer during training.

Abstract

Developing scalable solutions for training Graph Neural Networks (GNNs) for link prediction tasks is challenging because of high data dependencies, which entail high computational cost and a large memory footprint. We propose a new method for scaling the training of knowledge graph embedding models for link prediction to address these challenges. To this end, we propose the following algorithmic strategies: self-sufficient partitions, constraint-based negative sampling, and edge mini-batch training. Both the partitioning strategy and constraint-based negative sampling avoid cross-partition data transfer during training. In our experimental evaluation, we show that our scaling solution for GNN-based knowledge graph embedding models achieves a 16x speed-up on benchmark datasets while maintaining model performance comparable to that of non-distributed methods on standard metrics.
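The abstract names the three strategies without giving their details, so the following is only a rough sketch of how they might fit together, not the authors' implementation. It assumes that a "self-sufficient" partition is a list of triples already held by one worker, and that the "constraint" in negative sampling restricts corrupting entities to those present in that partition, so no lookup ever leaves the worker. All function names (`local_entities`, `edge_mini_batches`, `sample_negatives`) are hypothetical.

```python
import random
from typing import Iterator, List, Set, Tuple

Triple = Tuple[int, int, int]  # (head, relation, tail)


def local_entities(partition: List[Triple]) -> List[int]:
    """Collect every entity id that appears in this partition."""
    ents: Set[int] = set()
    for h, _, t in partition:
        ents.add(h)
        ents.add(t)
    return sorted(ents)


def edge_mini_batches(partition: List[Triple], batch_size: int,
                      rng: random.Random) -> Iterator[List[Triple]]:
    """Shuffle the partition's edges and yield them in fixed-size batches."""
    edges = list(partition)
    rng.shuffle(edges)
    for i in range(0, len(edges), batch_size):
        yield edges[i:i + batch_size]


def sample_negatives(batch: List[Triple], entities: List[int],
                     known: Set[Triple], k: int,
                     rng: random.Random) -> List[Triple]:
    """Corrupt head or tail using only partition-local entities (assumed constraint)."""
    negatives: List[Triple] = []
    for h, r, t in batch:
        made, attempts = 0, 0
        # Cap the attempts so a dense partition cannot loop forever.
        while made < k and attempts < 50 * k:
            attempts += 1
            e = rng.choice(entities)
            cand = (h, r, e) if rng.random() < 0.5 else (e, r, t)
            if cand not in known:  # skip corruptions that are true local triples
                negatives.append(cand)
                made += 1
    return negatives


# Toy partition; a real one would come from a graph partitioner.
partition = [(0, 0, 1), (1, 0, 2), (2, 1, 3), (3, 1, 0)]
rng = random.Random(0)
entities = local_entities(partition)
known = set(partition)
for batch in edge_mini_batches(partition, batch_size=2, rng=rng):
    negatives = sample_negatives(batch, entities, known, k=2, rng=rng)
    # A training step would score `batch` against `negatives` here.
```

Because every candidate entity is drawn from `entities`, the sampler never needs an embedding stored on another worker, which is one plausible reading of how the approach avoids cross-partition transfer. The trade-off is that negatives are drawn from a smaller pool than the full entity set.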

Taxonomy

Topics: Advanced Graph Neural Networks · Topic Modeling

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings