Assessing LLMs for Serendipity Discovery in Knowledge Graphs: A Case for Drug Repurposing
Mengying Wang, Chenhui Ma, Ao Jiao, Tuo Liang, Pengjun Lu, Shrinidhi Hegde, Yu Yin, Evren Gurkan-Cavusoglu, Yinghui Wu

TL;DR
This paper introduces SerenQA, a framework and benchmark for evaluating large language models' ability to discover surprising and valuable insights in knowledge graphs, with a focus on drug repurposing.
Contribution
It formally defines the serendipity-aware KGQA task, proposes a novel evaluation metric, and provides a benchmark dataset based on the Clinical Knowledge Graph.
Findings
LLMs excel at knowledge retrieval but struggle with serendipitous discovery.
The SerenQA benchmark reveals gaps in LLMs' ability to identify surprising insights.
The framework enables structured evaluation of serendipity in scientific KGQA tasks.
Abstract
Large Language Models (LLMs) have greatly advanced knowledge graph question answering (KGQA), yet existing systems are typically optimized to return highly relevant but predictable answers. A missing yet desirable capability is using LLMs to suggest surprising and novel ("serendipitous") answers. In this paper, we formally define the serendipity-aware KGQA task and propose the SerenQA framework to evaluate LLMs' ability to uncover unexpected insights in scientific KGQA tasks. SerenQA includes a rigorous serendipity metric based on relevance, novelty, and surprise, along with an expert-annotated benchmark derived from the Clinical Knowledge Graph, focused on drug repurposing. Additionally, it features a structured evaluation pipeline encompassing three subtasks: knowledge retrieval, subgraph reasoning, and serendipity exploration. Our experiments reveal that while state-of-the-art…
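The abstract names three ingredients of the serendipity metric (relevance, novelty, surprise) but does not give the formula. As a rough illustration only, the sketch below combines three per-answer scores with a normalized weighted average and ranks candidate answers by the result. The weighted-average form, the equal default weights, the [0, 1] score range, and the names `CandidateScores`, `serendipity_score`, and `rank_by_serendipity` are all assumptions made for this example; they are not taken from the paper or from SerenQA.

```python
from dataclasses import dataclass


@dataclass
class CandidateScores:
    """Hypothetical per-answer scores, each assumed to lie in [0, 1]."""
    relevance: float
    novelty: float
    surprise: float


def serendipity_score(s: CandidateScores,
                      weights: tuple[float, float, float] = (1 / 3, 1 / 3, 1 / 3)) -> float:
    """Normalized weighted average of the three components.

    This aggregation is an assumption for illustration; the paper's
    actual metric may combine the components differently.
    """
    w_rel, w_nov, w_sur = weights
    total = w_rel + w_nov + w_sur
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    combined = w_rel * s.relevance + w_nov * s.novelty + w_sur * s.surprise
    return combined / total


def rank_by_serendipity(candidates: dict[str, CandidateScores]) -> list[str]:
    """Return candidate names sorted from most to least serendipitous."""
    return sorted(candidates,
                  key=lambda name: serendipity_score(candidates[name]),
                  reverse=True)


if __name__ == "__main__":
    # Made-up scores for two placeholder answers, not real benchmark data.
    demo = {
        "answer_a": CandidateScores(relevance=0.9, novelty=0.1, surprise=0.1),
        "answer_b": CandidateScores(relevance=0.6, novelty=0.8, surprise=0.7),
    }
    print(rank_by_serendipity(demo))
```

Under these assumed weights, an answer that is highly relevant but predictable (`answer_a`) ranks below one that is moderately relevant yet novel and surprising (`answer_b`), which is the kind of trade-off a serendipity-aware evaluation is meant to expose.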
Taxonomy
Topics: Advanced Graph Neural Networks · Machine Learning in Healthcare · Topic Modeling
