Strikingness-Aware Evaluation for Temporal Knowledge Graph Reasoning
Rikui Huang, Shengzhe Zhang, Wei Wei

TL;DR
This paper introduces a strikingness-aware evaluation framework for temporal knowledge graph reasoning, emphasizing the importance of predicting rare, outstanding events over trivial repetitions.
Contribution
It proposes a rule-based strikingness measuring framework and integrates it into evaluation metrics, improving the assessment of reasoning models on significant events.
Findings
All models perform worse on high-strikingness events.
Path-based models excel on low-strikingness events.
Ensemble methods tend to fit trivial events rather than perform genuine reasoning.
Abstract
Temporal Knowledge Graph Reasoning (TKGR) aims to infer missing (especially future) events from historical data. Current TKGR evaluation weights all events uniformly, ignoring that most are trivial repetitions, and therefore overestimates models' true reasoning ability. The rare, outstanding events, whose prediction demands deeper reasoning, should instead be distinguished and emphasized. To this end, we propose a strikingness-aware evaluation framework, which introduces a rule-based strikingness measuring framework (RSMF) to quantify an event's strikingness by comparing its expected occurrence with that of peer events derived from temporal rules. Strikingness is then integrated as a weighting factor into metrics such as weighted MRR and Hits@k. Experiments on four TKG benchmarks reveal: 1) All representative models perform worse as event strikingness increases, 2) Path-based methods excel on…
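The abstract says strikingness enters MRR and Hits@k as a weighting factor but, as excerpted here, does not give the formula. The sketch below is one plausible reading, not the authors' definition: it assumes each test event has a non-negative strikingness score and that the metric is a weight-normalized average. The function names weighted_mrr and weighted_hits_at_k, and the scores themselves, are hypothetical; computing the scores via RSMF is not shown.

```python
from typing import Sequence


def weighted_mrr(ranks: Sequence[int], weights: Sequence[float]) -> float:
    """Weight-normalized mean reciprocal rank.

    ranks[i]   : 1-based rank of the correct answer for test event i.
    weights[i] : strikingness score of event i (assumed non-negative).
    Assumption: normalization by the sum of weights; the paper may differ.
    """
    if len(ranks) != len(weights):
        raise ValueError("ranks and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w / r for r, w in zip(ranks, weights)) / total


def weighted_hits_at_k(ranks: Sequence[int], weights: Sequence[float], k: int) -> float:
    """Weight-normalized Hits@k: share of total weight on events ranked <= k."""
    if len(ranks) != len(weights):
        raise ValueError("ranks and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w for r, w in zip(ranks, weights) if r <= k) / total
```

As an illustration with made-up numbers: for ranks [1, 1, 10] and strikingness weights [0.1, 0.1, 0.8], the uniform MRR is 0.7, while the weighted MRR under this sketch is about 0.28, because the one poorly ranked event carries most of the weight. With all weights equal, both functions reduce to the standard MRR and Hits@k.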
