Bringing Light Into the Dark: A Large-scale Evaluation of Knowledge Graph Embedding Models Under a Unified Framework

Mehdi Ali; Max Berrendorf; Charles Tapley Hoyt; Laurent Vermue; Mikhail Galkin; Sahand Sharifzadeh; Asja Fischer; Volker Tresp; Jens Lehmann

arXiv:2006.13365 · cs.LG · November 2, 2021

TL;DR

This study re-implements 21 knowledge graph embedding models in the PyKEEN software package and evaluates them on four datasets, providing insights into reproducibility, best practices, and the factors that influence model performance.

Contribution

It offers a large-scale, unified benchmarking framework for knowledge graph embeddings, highlighting the importance of configuration choices beyond architecture.

Findings

01. Model performance depends heavily on architecture, training, and inverse relation modeling.

02. Many models can achieve competitive results with proper configuration.

03. Reproducibility issues are prevalent, but can be mitigated with standardized implementations.
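
To make "inverse relation modeling" concrete, here is a minimal sketch of the general idea as it is commonly described: each training triple (h, r, t) is paired with a triple (t, r_inverse, h), so that predicting a head entity can be treated as predicting a tail entity under the inverse relation. The function name `add_inverse_triples`, the `_inverse` suffix, and the example triples are illustrative assumptions, not the paper's or PyKEEN's actual implementation.

```python
from typing import List, Tuple

# A triple is (head, relation, tail); string labels are used for readability.
Triple = Tuple[str, str, str]


def add_inverse_triples(triples: List[Triple], suffix: str = "_inverse") -> List[Triple]:
    """Return the original triples followed by one inverse triple per original.

    Hypothetical helper: the suffix naming scheme is an assumption made
    for illustration only.
    """
    inverse = [(t, r + suffix, h) for h, r, t in triples]
    return list(triples) + inverse


# Made-up example triples, not taken from any dataset in the paper.
triples = [
    ("berlin", "capital_of", "germany"),
    ("germany", "part_of", "europe"),
]
augmented = add_inverse_triples(triples)
for triple in augmented:
    print(triple)
```

With this augmentation, a head-prediction query such as (?, capital_of, germany) can be rewritten as the tail-prediction query (germany, capital_of_inverse, ?), at the cost of doubling the number of relation representations to learn.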

Abstract

The heterogeneity in recently published knowledge graph embedding models' implementations, training, and evaluation has made fair and thorough comparisons difficult. In order to assess the reproducibility of previously published results, we re-implemented and evaluated 21 interaction models in the PyKEEN software package. Here, we outline which results could be reproduced with their reported hyper-parameters, which could only be reproduced with alternate hyper-parameters, and which could not be reproduced at all as well as provide insight as to why this might be the case. We then performed a large-scale benchmarking on four datasets with several thousands of experiments and 24,804 GPU hours of computation time. We present insights gained as to best practices, best configurations for each model, and where improvements could be made over previously published best configurations. Our…
