The Provenance Gap in Clinical AI: Evidence-Traceable Temporal Knowledge Graphs for Rare Disease Reasoning
Md Shamim Ahmed, Maja Dusanic, Moritz Nikolai Kirschner, Elisabeth Nyoungui, Jana Zschüntzsch, Lukas Galke Poech, Richard Röttger

TL;DR
This paper introduces HEG-TKG, a system that provides verifiable, evidence-grounded citations for clinical AI outputs using temporal knowledge graphs, addressing the Provenance Gap in rare disease reasoning.
Contribution
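The paper presents HEG-TKG, a system that grounds clinical claims in temporal knowledge graphs and attaches inline citations. In a controlled three-arm comparison using the same synthesis model, it matches baseline clinical feature coverage while achieving 100% evidence verifiability.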
Findings
HEG-TKG achieves 100% evidence verifiability with inline citations.
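None of the five frontier LLMs tested produced a clinically relevant PubMed identifier without prompting; when explicitly asked to cite, the best model reached 15.3% relevant PMIDs.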
Clinician evaluation confirms the verifiability advantage of HEG-TKG.
Abstract
Frontier large language models generate clinically accurate outputs, but their citations are often fabricated. We term this the Provenance Gap. We tested five frontier LLMs across 36 clinician-validated scenarios for three rare neuromuscular disease pairs. No model produced a clinically relevant PubMed identifier without prompting. When explicitly asked to cite, the best model achieved 15.3% relevant PMIDs; the majority resolved to real publications in unrelated fields. We present HEG-TKG (Hierarchical Evidence-Grounded Temporal Knowledge Graphs), a system that grounds clinical claims in temporal knowledge graphs built from 4,512 PubMed records and curated sources with quality-tier stratification and 1,280 disease-trajectory milestones. In a controlled three-arm comparison using the same synthesis model, HEG-TKG matches baseline clinical feature coverage while achieving 100% evidence…
