GAPMAP: Mapping Scientific Knowledge Gaps in Biomedical Literature Using Large Language Models
Nourah M Salem, Elizabeth White, Michael Bada, Lawrence Hunter

TL;DR
This paper explores how large language models can identify explicit and implicit knowledge gaps in biomedical literature, demonstrating their potential to support research and policy decisions through structured reasoning schemes.
Contribution
The paper introduces the novel task of inferring implicit knowledge gaps, proposes the TABI reasoning scheme, and benchmarks closed-weight (OpenAI) and open-weight (Llama, Gemma 2) LLMs on gap detection in biomedical literature.
Findings
LLMs can effectively identify both explicit and implicit knowledge gaps.
Larger models tend to perform better in gap detection tasks.
The TABI scheme improves reasoning and validation of inferred gaps.
Abstract
Scientific progress is driven by the deliberate articulation of what remains unknown. This study investigates the ability of large language models (LLMs) to identify research knowledge gaps in the biomedical literature. We define two categories of knowledge gaps: explicit gaps, clear declarations of missing knowledge; and implicit gaps, context-inferred missing knowledge. While prior work has focused mainly on explicit gap detection, we extend this line of research by addressing the novel task of inferring implicit gaps. We conducted two experiments on almost 1500 documents across four datasets, including a manually annotated corpus of biomedical articles. We benchmarked both closed-weight models (from OpenAI) and open-weight models (Llama and Gemma 2) under paragraph-level and full-paper settings. To address the reasoning of implicit gap inference, we introduce TABI, a…
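The two gap categories and the two evaluation settings named in the abstract can be made concrete with a small sketch. This is an illustration only, not the authors' code: the names `GapKind`, `KnowledgeGap`, and `build_inputs` are hypothetical, and splitting paragraphs on blank lines is an assumption about the input format.

```python
from dataclasses import dataclass
from enum import Enum


class GapKind(Enum):
    EXPLICIT = "explicit"  # clear declaration of missing knowledge
    IMPLICIT = "implicit"  # missing knowledge inferred from context


@dataclass
class KnowledgeGap:
    # Hypothetical record for one detected gap.
    kind: GapKind
    statement: str    # the gap, as stated or as inferred
    source_text: str  # the model input the gap was drawn from


def build_inputs(paper: str, setting: str) -> list[str]:
    """Turn one paper into model inputs for the chosen setting."""
    if setting == "full-paper":
        # The whole paper is passed to the model as a single input.
        return [paper.strip()]
    if setting == "paragraph":
        # Assumption: paragraphs are separated by blank lines.
        return [p.strip() for p in paper.split("\n\n") if p.strip()]
    raise ValueError(f"unknown setting: {setting!r}")
```

Under these assumptions, the paragraph setting yields one model input per paragraph, while the full-paper setting yields a single input, so each `KnowledgeGap` can be traced back to the exact text the model saw.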
