Evaluating Knowledge Graph Complexity via Semantic, Spectral, and Structural Metrics for Link Prediction
Haji Gul, Abul Ghani Naim, Ajaz Ahmad Bhat

TL;DR
This paper critically evaluates the Cumulative Spectral Gradient (CSG) metric for knowledge graph complexity, finding it unreliable, and proposes alternative structural and semantic metrics that better correlate with link prediction performance.
Contribution
The study demonstrates the limitations of CSG in KG link prediction and introduces more robust complexity metrics based on relation entropy and diversity.
Findings
CSG is sensitive to parametrisation and does not reliably scale with the number of classes.
Relation entropy and relation diversity strongly correlate with link prediction difficulty.
Graph connectivity measures show positive correlation with certain performance metrics.
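The summary does not give the paper's exact definitions of relation entropy and relation diversity, so the following is only a minimal sketch under stated assumptions. It takes relation entropy to be the Shannon entropy of the relation-type frequency distribution over triples, and relation diversity to be the largest number of distinct relation types touching any single entity. The function names and the toy triples are hypothetical.

```python
import math
from collections import Counter, defaultdict


def relation_entropy(triples):
    """Shannon entropy (in bits) of relation-type frequencies.

    Assumption: entropy is taken over how often each relation type
    occurs among the (head, relation, tail) triples.
    """
    counts = Counter(rel for _, rel, _ in triples)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def max_relation_diversity(triples):
    """Largest number of distinct relation types incident to one entity.

    Assumption: an entity counts a relation whether it appears as
    head or as tail of the triple.
    """
    relations_per_entity = defaultdict(set)
    for head, rel, tail in triples:
        relations_per_entity[head].add(rel)
        relations_per_entity[tail].add(rel)
    return max(len(rels) for rels in relations_per_entity.values())


# Hypothetical toy graph: two relation types, used equally often.
triples = [
    ("a", "likes", "b"),
    ("a", "knows", "c"),
    ("b", "knows", "c"),
    ("c", "likes", "a"),
]

entropy = relation_entropy(triples)
diversity = max_relation_diversity(triples)
```

Under these assumptions, a graph dominated by one relation type has entropy near zero, while a graph whose relation types are used evenly has entropy close to the base-2 logarithm of the number of relation types.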
Abstract
Understanding dataset complexity is fundamental to evaluating and comparing link prediction models on knowledge graphs (KGs). While the Cumulative Spectral Gradient (CSG) metric, derived from probabilistic divergence between classes within a spectral clustering framework, has been proposed as a classifier-agnostic complexity metric purportedly scaling with class cardinality and correlating with downstream performance, it has not been evaluated in KG settings so far. In this work, we critically examine CSG in the context of multi-relational link prediction, incorporating semantic representations via transformer-derived embeddings. Contrary to prior claims, we find that CSG is highly sensitive to parametrisation and does not robustly scale with the number of classes. Moreover, it exhibits weak or inconsistent correlation with standard performance metrics such as Mean Reciprocal Rank (MRR)…
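For readers unfamiliar with Mean Reciprocal Rank, the standard definition is the average of 1/rank over all test queries, where rank is the position of the correct entity in the model's ranked candidate list. The sketch below illustrates that general definition only. It is not taken from the paper's evaluation code, and the function name and example ranks are hypothetical.

```python
def mean_reciprocal_rank(ranks):
    """Average of 1/rank over queries; ranks are 1-based positions
    of the correct answer in each ranked candidate list."""
    if not ranks:
        raise ValueError("ranks must be non-empty")
    if any(r < 1 for r in ranks):
        raise ValueError("ranks are 1-based and must be >= 1")
    return sum(1.0 / r for r in ranks) / len(ranks)


# Hypothetical ranks of the true entity for three test triples.
example_ranks = [1, 2, 4]
mrr = mean_reciprocal_rank(example_ranks)  # (1 + 1/2 + 1/4) / 3
```

MRR lies in (0, 1], and higher values mean the correct entity tends to appear nearer the top of the ranking.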
Taxonomy
Topics: Advanced Graph Neural Networks · Data Quality and Management · Bioinformatics and Genomic Networks
