How Sharp and Bias-Robust is a Model? Dual Evaluation Perspectives on Knowledge Graph Completion
Sooho Moon, Yunyong Ko

TL;DR
This paper introduces PROBE, a new evaluation framework for knowledge graph completion that considers predictive sharpness and popularity-bias robustness, providing more reliable and comprehensive model assessments.
Contribution
It proposes a novel evaluation framework, PROBE, whose rank transformer (RT) and rank aggregator (RA) components assess KGC models in terms of both predictive sharpness and popularity-bias robustness.
Findings
Existing metrics often misestimate KGC model accuracy.
PROBE provides more reliable and comprehensive evaluation results.
Experiments on real-world KGs demonstrate PROBE's effectiveness.
Abstract
Knowledge graph completion (KGC) aims to predict missing facts from the observed KG. While a number of KGC models have been studied, the evaluation of KGC still remains underexplored. In this paper, we observe that existing metrics overlook two key perspectives for KGC evaluation: (A1) predictive sharpness -- the degree of strictness in evaluating an individual prediction, and (A2) popularity-bias robustness -- the ability to predict low-popularity entities. Toward reflecting both perspectives, we propose a novel evaluation framework (PROBE), which consists of a rank transformer (RT) estimating the score of each prediction based on a required level of predictive sharpness and a rank aggregator (RA) aggregating all the scores in a popularity-aware manner. Experiments on real-world KGs reveal that existing metrics tend to over- or under-estimate the accuracy of KGC models, whereas PROBE…
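The abstract does not give the functional forms of RT and RA, so the following is only an illustrative sketch of the two-stage idea, not the authors' definitions. It assumes a power-law rank transform controlled by a hypothetical `sharpness` parameter and an inverse-popularity weighting controlled by a hypothetical `beta` parameter; both functions, both parameters, and the toy data are assumptions made for illustration. With `sharpness=1` and `beta=0` the sketch reduces to the standard mean reciprocal rank (MRR).

```python
from collections import Counter


def rank_transform(rank, sharpness=1.0):
    """Hypothetical RT: map a 1-based rank to a score in (0, 1].

    A larger `sharpness` scores lower-ranked predictions more strictly.
    This functional form is an assumption, not taken from the paper.
    """
    return 1.0 / (rank ** sharpness)


def popularity_aware_mean(scores, entities, popularity, beta=1.0):
    """Hypothetical RA: weighted mean with weight = popularity ** (-beta).

    beta = 0 gives a plain mean; beta > 0 up-weights low-popularity entities.
    This weighting scheme is an assumption, not taken from the paper.
    """
    weights = [popularity[e] ** (-beta) for e in entities]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


# Toy data (made up): rank of the correct entity for three test queries,
# the target entity of each query, and how often each entity appears in the KG.
ranks = [1, 2, 10]
targets = ["paris", "paris", "tuvalu"]
popularity = Counter({"paris": 100, "tuvalu": 1})

# Standard MRR: mean of reciprocal ranks, every query weighted equally.
mrr = sum(1.0 / r for r in ranks) / len(ranks)

# Sharper, popularity-aware score under the assumptions above.
scores = [rank_transform(r, sharpness=2.0) for r in ranks]
probe_like = popularity_aware_mean(scores, targets, popularity, beta=1.0)

print(f"MRR = {mrr:.4f}, PROBE-like score = {probe_like:.4f}")
```

In this toy case the model ranks the popular entity well and the rare entity poorly, so the popularity-aware score falls well below MRR, which is the kind of over-estimation by existing metrics that the abstract describes.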
Taxonomy
Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · Data Quality and Management
