Scaling Knowledge Graph Embedding Models
Nasrullah Sheikh, Xiao Qin, Berthold Reinwald, Chuan Lei

TL;DR
This paper introduces a scalable training method for GNN-based knowledge graph embedding models that reports a 16x training speed-up on benchmark datasets while keeping link prediction performance comparable to non-distributed methods.
Contribution
The paper combines three strategies to scale GNN-based knowledge graph embedding models: self-sufficient partitions, constraint-based negative sampling, and edge mini-batch training.
Findings
Achieves a 16x speed-up on benchmark datasets
Maintains performance comparable to non-distributed methods on standard metrics
Avoids cross-partition data transfer during training
Abstract
Developing scalable solutions for training Graph Neural Networks (GNNs) for link prediction tasks is challenging due to the high data dependencies, which entail high computational cost and a huge memory footprint. We propose a new method for scaling the training of knowledge graph embedding models for link prediction to address these challenges. To this end, we propose the following algorithmic strategies: self-sufficient partitions, constraint-based negative sampling, and edge mini-batch training. Both the partitioning strategy and constraint-based negative sampling avoid cross-partition data transfer during training. In our experimental evaluation, we show that our scaling solution for GNN-based knowledge graph embedding models achieves a 16x speed-up on benchmark datasets while maintaining model performance comparable to that of non-distributed methods on standard metrics.
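The abstract names the three strategies without describing their mechanics. The sketch below is one plausible reading, not the authors' implementation. It assumes that a "self-sufficient" partition stores its core edges plus the neighbouring edges a GNN layer would need, that "constraint-based" negative sampling corrupts triples using only entities already present in the partition, and that edge mini-batches iterate over the core edges. All function names (`expand_partition`, `sample_negatives`, `edge_minibatches`) and the toy graph are hypothetical.

```python
import random

def expand_partition(core_edges, all_edges, hops=1):
    """Assumed 'self-sufficient' partition: the core edges plus every edge
    within `hops` of a core node, so message passing needs no remote data."""
    nodes = {n for h, _, t in core_edges for n in (h, t)}
    edges = set(core_edges)
    for _ in range(hops):
        frontier = {e for e in all_edges if e[0] in nodes or e[2] in nodes}
        edges |= frontier
        nodes |= {n for h, _, t in frontier for n in (h, t)}
    return edges, nodes

def sample_negatives(edge, local_nodes, known_edges, k, rng):
    """Assumed 'constraint-based' sampling: corrupt the tail using only
    entities stored in this partition, skipping known true triples."""
    h, r, t = edge
    candidates = sorted(
        n for n in local_nodes if n != t and (h, r, n) not in known_edges
    )
    picked = rng.sample(candidates, min(k, len(candidates)))
    return [(h, r, n) for n in picked]

def edge_minibatches(core_edges, batch_size, rng):
    """Yield shuffled batches of core edges (the positive training triples)."""
    edges = list(core_edges)
    rng.shuffle(edges)
    for i in range(0, len(edges), batch_size):
        yield edges[i:i + batch_size]

# Toy graph: a chain 0 -> 1 -> 2 -> 3 -> 4 -> 5 with one relation type.
all_edges = [(i, "r", i + 1) for i in range(5)]
core = [(0, "r", 1), (1, "r", 2)]
rng = random.Random(0)

local_edges, local_nodes = expand_partition(core, all_edges, hops=1)
batches = list(edge_minibatches(core, batch_size=1, rng=rng))
negatives = sample_negatives(core[0], local_nodes, local_edges, k=2, rng=rng)
```

Under these assumptions, each worker can build its batches and negatives entirely from `local_edges` and `local_nodes`, which is the property the abstract credits for avoiding cross-partition data transfer.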
Taxonomy
Topics: Advanced Graph Neural Networks · Topic Modeling
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
