On the Ambiguity of Rank-Based Evaluation of Entity Alignment or Link Prediction Methods
Max Berrendorf, Evgeniy Faerman, Laurent Vermue, Volker Tresp

TL;DR
This paper critically examines the limitations of current rank-based evaluation metrics for knowledge graph tasks, revealing issues with comparability and interpretation, and proposes adjustments for fairer assessment.
Contribution
It identifies shortcomings of existing evaluation scores and introduces modifications to improve fairness, comparability, and interpretability in model performance assessment.
Findings
Existing scores are inadequate for cross-dataset comparison.
For Entity Alignment, the size of the test set alone changes the scores that the same model obtains under commonly used metrics.
Proposed evaluation adjustments improve fairness and interpretability.
Abstract
In this work, we take a closer look at the evaluation of two families of methods for enriching information from knowledge graphs: Link Prediction and Entity Alignment. In the current experimental setting, multiple different scores are employed to assess different aspects of model performance. We analyze the informativeness of these evaluation measures and identify several shortcomings. In particular, we demonstrate that all existing scores can hardly be used to compare results across different datasets. Moreover, we demonstrate that varying the size of the test set automatically has an impact on the performance of the same model based on commonly used metrics for the Entity Alignment task. We show that this leads to various problems in the interpretation of results, which may support misleading conclusions. Therefore, we propose adjustments to the evaluation and demonstrate empirically how…
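The dependence of rank-based scores on the number of candidates can be illustrated with a small sketch. This is not the authors' code: the function names are hypothetical, and it assumes a setting where each test query is ranked against a candidate set whose size grows with the test set. The normalization in `adjusted_mean_rank` (dividing by the mean rank expected under a uniformly random ordering) is shown as one plausible kind of adjustment, not necessarily the exact one proposed in the paper.

```python
def mean_rank(ranks):
    """Arithmetic mean of the ranks (1 = best)."""
    return sum(ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of queries whose true answer is ranked in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def mean_reciprocal_rank(ranks):
    """Mean of 1 / rank over all queries."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def expected_mean_rank_random(num_candidates):
    """Expected rank under a uniformly random ordering of the candidates."""
    return (num_candidates + 1) / 2


def adjusted_mean_rank(ranks, num_candidates):
    """Illustrative normalization (an assumption of this sketch):
    mean rank divided by its expectation under random ordering.
    A value of 1.0 corresponds to chance level for any candidate count."""
    return mean_rank(ranks) / expected_mean_rank_random(num_candidates)


if __name__ == "__main__":
    # Stand-in for an uninformative model: each rank occurs exactly once,
    # which matches the uniform-random expectation.
    for n in (100, 1000):
        ranks = list(range(1, n + 1))
        print(
            n,
            mean_rank(ranks),
            hits_at_k(ranks, 10),
            adjusted_mean_rank(ranks, n),
        )
```

In this sketch the same uninformative scorer obtains a different mean rank and Hits@10 on 100 candidates than on 1000, while the normalized score stays at chance level in both cases, which is the kind of cross-dataset comparability the abstract argues the raw scores lack.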
Taxonomy
Topics: Advanced Graph Neural Networks · Data Quality and Management · Topic Modeling
Methods: Test
