Decomposing Uncertainty in Probabilistic Knowledge Graph Embeddings: Why Entity Variance Is Not Enough
Chorok Lee

TL;DR
This paper shows that entity variance in probabilistic knowledge graph embeddings is insufficient for out-of-distribution detection and proposes a method that combines semantic and structural uncertainty, raising AUROC on temporal OOD detection from 0.52-0.64 (existing methods) to 0.94-0.99.
Contribution
It introduces a formal decomposition of uncertainty into semantic and structural components and develops CAGP, a method that effectively combines these signals for better out-of-distribution detection.
Findings
CAGP achieves 0.94-0.99 AUROC on temporal OOD detection.
Existing methods achieve only 0.52-0.64 AUROC on temporal shifts.
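Novel-context triples overlap completely in frequency with in-distribution triples on three benchmark datasets: 100 percent have frequency-matched in-distribution counterparts.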
Abstract
Probabilistic knowledge graph embeddings represent entities as distributions, using learned variances to quantify epistemic uncertainty. We identify a fundamental limitation: these variances are relation-agnostic, meaning an entity receives identical uncertainty regardless of relational context. This conflates two distinct out-of-distribution phenomena that behave oppositely: emerging entities (rare, poorly-learned) and novel relational contexts (familiar entities in unobserved relationships). We prove an impossibility result: any uncertainty estimator using only entity-level statistics independent of relation context achieves near-random OOD detection on novel contexts. We empirically validate this on three datasets, finding 100 percent of novel-context triples have frequency-matched in-distribution counterparts. This explains why existing probabilistic methods achieve 0.99 AUROC on…
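The intuition behind the impossibility result can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the paper's proof or experimental setup: the per-entity variances are random placeholders, `entity_level_score` is a hypothetical relation-agnostic estimator, and the novel-context triples are built by reusing the exact entity pairs of the in-distribution triples under an unobserved relation, which mimics perfect frequency matching.

```python
import random


def auroc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative.

    Ties count as 0.5, which matches the usual rank-based AUROC definition.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def entity_level_score(head, tail, variance):
    """Hypothetical relation-agnostic uncertainty: uses entity variances only."""
    return variance[head] + variance[tail]


rng = random.Random(0)
num_entities = 50

# Assumption: one placeholder "learned" variance per entity.
variance = {e: rng.random() for e in range(num_entities)}

# In-distribution triples (head, relation, tail) under observed relation 0.
in_dist = [
    (rng.randrange(num_entities), 0, rng.randrange(num_entities))
    for _ in range(200)
]

# Novel-context triples: the same familiar entities under unobserved relation 1,
# so every novel triple has a frequency-matched in-distribution counterpart.
novel = [(h, 1, t) for (h, _, t) in in_dist]

id_scores = [entity_level_score(h, t, variance) for h, _, t in in_dist]
ood_scores = [entity_level_score(h, t, variance) for h, _, t in novel]

# OOD triples are the positive class; the score never sees the relation.
result = auroc(ood_scores, id_scores)
print(result)
```

Because the score ignores the relation, the two score sets are identical and the detector cannot rank novel-context triples above in-distribution ones. Any signal that separates them has to depend on the relational context.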
Taxonomy
Topics: Advanced Graph Neural Networks · Topic Modeling · Machine Learning in Healthcare
