On the Ambiguity of Rank-Based Evaluation of Entity Alignment or Link Prediction Methods

Max Berrendorf; Evgeniy Faerman; Laurent Vermue; Volker Tresp

arXiv:2002.06914 · cs.LG · September 21, 2023 · 5 cites

TL;DR

This paper critically examines the limitations of current rank-based evaluation metrics for knowledge graph tasks, revealing issues with comparability and interpretation, and proposes adjustments for fairer assessment.

Contribution

It identifies shortcomings of existing evaluation scores and introduces modifications to improve fairness, comparability, and interpretability in model performance assessment.

Findings

01. Existing scores are inadequate for comparing results across datasets.

02. For Entity Alignment, the size of the test set affects the commonly used metric values of the same model.

03. The proposed evaluation adjustments improve fairness and interpretability.

Abstract

In this work, we take a closer look at the evaluation of two families of methods for enriching information from knowledge graphs: Link Prediction and Entity Alignment. In the current experimental setting, multiple different scores are employed to assess different aspects of model performance. We analyze the informativeness of these evaluation measures and identify several shortcomings. In particular, we demonstrate that none of the existing scores is well suited to comparing results across different datasets. Moreover, we demonstrate that, for the Entity Alignment task, varying the size of the test set automatically affects the performance of the same model as measured by commonly used metrics. We show that this leads to various problems in the interpretation of results, which may support misleading conclusions. Therefore, we propose adjustments to the evaluation and demonstrate empirically how…
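To make the test-size effect described in the abstract concrete, the following is a minimal sketch, not the authors' code. It computes three common rank-based scores (mean rank, mean reciprocal rank, Hits@k) and simulates a model that scores candidates at random. All function names are hypothetical. The normalization shown, dividing the mean rank by its expected value under random scoring, is only one illustrative adjustment; since the abstract is truncated here, it is an assumption that the paper's proposal is similar in spirit.

```python
import random


def rank_of_true(scores, true_idx):
    """1-based rank of the true candidate: 1 + number of strictly better scores."""
    true_score = scores[true_idx]
    return 1 + sum(s > true_score for s in scores)


def mean_rank(ranks):
    return sum(ranks) / len(ranks)


def mean_reciprocal_rank(ranks):
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    return sum(r <= k for r in ranks) / len(ranks)


def expected_random_rank(num_candidates):
    """Expected rank when the true candidate is equally likely at any position."""
    return (num_candidates + 1) / 2


def simulate_random_model(num_candidates, num_queries, seed=0):
    """Ranks produced by a model that assigns uniformly random scores."""
    rng = random.Random(seed)
    ranks = []
    for _ in range(num_queries):
        scores = [rng.random() for _ in range(num_candidates)]
        ranks.append(rank_of_true(scores, true_idx=0))
    return ranks


if __name__ == "__main__":
    for n in (100, 1000):
        ranks = simulate_random_model(num_candidates=n, num_queries=500)
        mr = mean_rank(ranks)
        # Illustrative normalization (an assumption, not the paper's definition).
        adjusted = mr / expected_random_rank(n)
        print(f"candidates={n} MR={mr:.1f} MRR={mean_reciprocal_rank(ranks):.4f} "
              f"Hits@10={hits_at_k(ranks, 10):.3f} MR/E[MR]={adjusted:.2f}")
```

The same uninformed model receives very different raw mean rank, MRR, and Hits@10 values depending only on how many candidates it ranks against, whereas the ratio to the expected random rank stays close to 1 in both settings.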

Code & Models

Repositories

mberr/rank-based-evaluation (PyTorch, official)

Taxonomy

Topics: Advanced Graph Neural Networks · Data Quality and Management · Topic Modeling

Methods: Test