# ConceptDrift: leveraging spatial, temporal and semantic evolution of biomedical concepts for hypothesis generation

**Authors:** Amir Hassan Shariatmadari, Alireza Jafari, Sikun Guo, Sneha Srinivasan, Nathan C Sheffield, Aidong Zhang, Kishlay Jha

PMC · DOI: 10.1093/bioinformatics/btaf563 · 2025-10-28

## TL;DR

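This paper introduces ConceptDrift, a framework that models the spatial, temporal, and semantic evolution of biomedical concepts as a sequence of temporal graphlets and uses it to generate more accurate hypotheses than state-of-the-art baselines.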

## Contribution

ConceptDrift is the first framework to integrate spatial, temporal, and semantic evolution of biomedical concepts into a unified hypothesis generation system.

## Key findings

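- ConceptDrift consistently outperforms state-of-the-art baselines across multiple datasets in generating accurate and meaningful hypotheses.
- The framework jointly encodes spatial, temporal, and semantic change in biomedical concepts, rather than treating these dimensions independently.
- It yields more robust and predictive feature representations, with immediate practical benefits for web-based literature mining tools in life sciences and biomedicine.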

## Abstract

Hypothesis generation is a fundamental problem in biomedical text mining that aims to generate ideas that are new, interesting, and plausible by discovering unexplored links between biomedical concepts. Despite significant advances made by existing approaches, they do not fully leverage the evolutionary properties of biomedical concepts. This is limiting because scientific knowledge continually evolves over time, with new facts being added and old ones becoming obsolete. Thus, it is crucial to capture the evolutionary properties of biomedical concepts from multiple perspectives (e.g. spatial, temporal, and semantic) to generate hypotheses that reflect the up-to-date information landscape of the biomedical domain.

We introduce a novel framework, ConceptDrift, that models the hypothesis generation task as a sequence of temporal graphlets and simultaneously encodes spatial, temporal, and semantic change. Unlike existing approaches that treat these dimensions independently, ConceptDrift is the first to provide a holistic understanding of concept evolution by integrating them into a unified framework. Grounded in the theories of the Distributional Hypothesis and Conceptual Change, our method adapts these principles to the unique challenges of large-scale biomedical literature. We conduct extensive experiments across multiple datasets and demonstrate that ConceptDrift consistently outperforms state-of-the-art baselines in generating accurate and meaningful hypotheses. Our framework shows immediate practical benefits for web-based literature mining tools in life sciences and biomedicine, offering more robust and predictive feature representations.

**Availability:** https://github.com/amir-hassan25/ConceptDrift (DOI: 10.6084/m9.figshare.29975476).
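The abstract describes the input as a sequence of temporal graphlets in which a concept's neighborhood (spatial), time stamp (temporal), and meaning (semantic) all change. The sketch below is only an illustration of that idea, not the authors' implementation: the snapshot layout, the function names, the placeholder concept names, and the choice of Jaccard distance for spatial change and cosine distance for semantic change are all assumptions made for this example.

```python
import math

def jaccard_distance(a, b):
    """Spatial change (assumed measure): 1 - |A & B| / |A | B| over neighbor sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def cosine_distance(u, v):
    """Semantic change (assumed measure): 1 - cosine similarity of two embeddings."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return 1.0 - dot / (norm_u * norm_v)

def concept_drift(snapshots, concept):
    """Compare each pair of consecutive snapshots (the temporal axis) for one concept."""
    drift = []
    for prev, curr in zip(snapshots, snapshots[1:]):
        drift.append({
            "from": prev["year"],
            "to": curr["year"],
            "spatial": jaccard_distance(
                prev["neighbors"].get(concept, set()),
                curr["neighbors"].get(concept, set()),
            ),
            "semantic": cosine_distance(
                prev["embeddings"][concept],
                curr["embeddings"][concept],
            ),
        })
    return drift

# Hypothetical toy snapshots: placeholder concept names, made-up 2-d embeddings.
snapshots = [
    {"year": 2000,
     "neighbors": {"concept_a": {"concept_b", "concept_c"}},
     "embeddings": {"concept_a": [1.0, 0.0]}},
    {"year": 2005,
     "neighbors": {"concept_a": {"concept_b", "concept_c", "concept_d"}},
     "embeddings": {"concept_a": [1.0, 0.0]}},
    {"year": 2010,
     "neighbors": {"concept_a": {"concept_d", "concept_e"}},
     "embeddings": {"concept_a": [0.0, 1.0]}},
]

drift = concept_drift(snapshots, "concept_a")
for step in drift:
    print(step)
```

In this toy data the concept gains one neighbor between the first two snapshots without changing meaning, then changes both its neighborhood and its embedding. A model that tracks only one of these signals would miss part of the picture, which is the motivation the abstract gives for encoding them jointly.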

## Full-text entities

- **Diseases:** Neoplasms (MESH:D009369), Raynaud's disease (MESH:D011928), Lysosomal storage diseases (MESH:D016464), Amyotrophic lateral sclerosis (MESH:D000690), TGN (MESH:C536956), PLM (MESH:D000095027), Parkinson Disease (MESH:D010300)
- **Chemicals:** Cytarabine (MESH:D003561), Cycloheximide (MESH:D003513)

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12582365/full.md

---
Source: https://tomesphere.com/paper/PMC12582365