# Federated SPARQL query performance evaluation for exploring disease model mouse: combining gene expression, orthology, and disease knowledge graphs

**Authors:** Tatsuya Kushida, Tarcisio Mendes de Farias, Ana C. Sima, Christophe Dessimoz, Hirokazu Chiba, Frederic B. Bastian, Hiroshi Masuya

PMC · DOI: 10.1186/s12911-025-03013-8 · BMC Medical Informatics and Decision Making · 2025-05-16

## TL;DR

This paper evaluates federated versus centralized SPARQL query performance when combining gene expression, orthology, and disease knowledge graphs to find suitable disease model mice.

## Contribution

The study introduces a federated SPARQL approach to integrate and query disease-related data from multiple knowledge graphs for mouse model selection.

## Key findings

- Federated SPARQL queries identified 14 Alzheimer’s-related genes and 55 relevant genetically modified mouse bioresources.
- Melanoma-related genes and their anatomical expression were identified using transitive Uberon term searches.
- Federated queries performed worse than centralized ones; the degradation was reduced by refining the transferred data through a subquery and enhancing server specifications.

## Abstract

The RIKEN BRC develops and maintains the RIKEN BioResource MetaDatabase to help users explore appropriate target bioresources for their experiments and prepare precise and high-quality data infrastructures. The Swiss Institute of Bioinformatics (SIB) develops two multi-species databases for the study of gene expression and orthology: Bgee and Orthologous MAtrix (OMA).

This study combines the RIKEN BioResource data with Resource Description Framework (RDF) datasets from Bgee (a gene expression database), OMA, DisGeNET (a human gene-disease association database), Mouse Genome Informatics (MGI), UniProt, and four disease ontologies in the RIKEN BioResource MetaDatabase. Our aim is to evaluate distributed SPARQL query performance when exploring which model organisms are most appropriate for specific medical research applications across these interoperable datasets. More precisely, in our biomedical use cases we investigate disease-related genes and the anatomical parts where these genes are expressed, and subsequently identify appropriate bioresource candidates available for specific disease research applications.
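The chain of lookups described above (disease to genes, genes to expression sites, genes to orthologs, orthologs to bioresources) can be sketched as a series of joins. This is a minimal in-memory illustration, not the authors' SPARQL queries: every record, the strain names, the `GENE_X` entry, and the order of the joins are placeholders assumed for the example.

```python
# All records below are illustrative placeholders, not data from the paper.
DISEASE_GENES = {"Alzheimer's disease": ["APP", "APOE", "GENE_X"]}  # gene-disease source
EXPRESSION = {                                                      # expression source
    "APP": {"prefrontal cortex"},
    "APOE": {"prefrontal cortex"},
    "GENE_X": {"liver"},
}
ORTHOLOGS = {"APP": "App", "APOE": "Apoe", "GENE_X": "Gene_x"}      # orthology source
STRAINS = {"App": ["toy-strain-1", "toy-strain-2"], "Apoe": ["toy-strain-3"]}


def candidate_strains(disease, anatomy):
    """Join the four toy sources and return (mouse gene, strain) pairs."""
    results = []
    for human_gene in DISEASE_GENES.get(disease, []):
        # Keep only genes expressed in the anatomical part of interest.
        if anatomy not in EXPRESSION.get(human_gene, set()):
            continue
        mouse_gene = ORTHOLOGS.get(human_gene)
        # Attach every bioresource registered for the mouse ortholog.
        for strain in STRAINS.get(mouse_gene, []):
            results.append((mouse_gene, strain))
    return sorted(results)


pairs = candidate_strains("Alzheimer's disease", "prefrontal cortex")
```

In a federated setting each dictionary would correspond to a separate RDF dataset or endpoint, so the cost of each join depends on how many intermediate rows cross the network.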

We illustrate this through two use cases, targeting Alzheimer’s disease and melanoma. We identified 14 Alzheimer’s disease-related genes expressed in the prefrontal cortex (e.g., APP and APOE) and 55 RIKEN bioresources, genetically modified mice related to these genes, that are predicted to be relevant to Alzheimer’s disease research. Furthermore, by executing a transitive search over Uberon terms with the SPARQL Property Paths function, we identified 14 melanoma-related genes (e.g., HRAS and PTEN) and 12 anatomical parts in which these genes were expressed, such as the “skin of limb”. Finally, we compared the performance of the federated SPARQL query via the remote Bgee SPARQL endpoint with the performance of a centralized SPARQL query using the Bgee dataset as part of the RIKEN BioResource MetaDatabase.
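A transitive search of the kind a SPARQL property path performs (for example a zero-or-more path such as `rdfs:subClassOf*`) amounts to computing a reachability closure over the ontology's edges. The sketch below shows that idea on a toy hierarchy. The edges are assumptions made up for illustration and do not reproduce the real Uberon structure, and the abstract does not say which predicate the authors traversed.

```python
from collections import deque

# Toy (child, parent) edges; illustrative only, NOT the real Uberon hierarchy.
TOY_EDGES = [
    ("zone of skin", "skin of body"),
    ("skin of limb", "zone of skin"),
    ("skin of forelimb", "skin of limb"),
    ("skin of hindlimb", "skin of limb"),
    ("prefrontal cortex", "cerebral cortex"),
]


def descendants_or_self(root, edges):
    """Return root plus every term that reaches root through child->parent edges.

    This mirrors what a zero-or-more property path matches: the start term
    itself and anything connected to it by a chain of any length.
    """
    children = {}
    for child, parent in edges:
        children.setdefault(parent, []).append(child)

    seen = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen:  # guard against revisiting shared terms
                seen.add(child)
                queue.append(child)
    return seen


skin_terms = descendants_or_self("skin of body", TOY_EDGES)
```

Searching with the closure rather than a single term is what lets a query for a broad anatomical region also return genes annotated to its more specific parts.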

As a result, we confirmed that the federated approach performed worse than the centralized one. We reduced this degradation for federated queries from the BioResource MetaDatabase to the SIB by refining the transferred data through a subquery and by enhancing the server specifications, thereby optimizing the triple store query evaluation.
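One plausible reading of "refining the transferred data through a subquery" is to place a narrow `SELECT DISTINCT` subquery inside the `SERVICE` block, so the remote endpoint returns only the rows and variables the local query needs. The helper below builds such a query string as a sketch. The `ex:` prefix, its predicates, and the endpoint URL are hypothetical placeholders, not the Bgee schema or the authors' actual query.

```python
def build_federated_query(endpoint, gene_ids, limit=None):
    """Build a SPARQL 1.1 federated query with a subquery inside SERVICE.

    The subquery restricts the remote results to the listed genes and
    projects only two variables, which limits what is sent back.
    """
    values = " ".join(f'"{gene_id}"' for gene_id in gene_ids)
    limit_clause = f"LIMIT {int(limit)}" if limit is not None else ""
    # Doubled braces are literal braces in the f-string template.
    return f"""PREFIX ex: <http://example.org/schema#>
SELECT ?gene ?anatomy WHERE {{
  SERVICE <{endpoint}> {{
    {{
      SELECT DISTINCT ?gene ?anatomy WHERE {{
        VALUES ?geneId {{ {values} }}
        ?gene ex:identifier ?geneId ;
              ex:isExpressedIn ?anatomy .
      }} {limit_clause}
    }}
  }}
}}"""


# Hypothetical endpoint URL used only for illustration.
query = build_federated_query("https://example.org/sparql", ["APP", "APOE"], limit=100)
```

The resulting string could be sent to a local triple store with any SPARQL client. How much the subquery helps depends on how the store's federation engine evaluates `SERVICE` clauses.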

The online version contains supplementary material available at 10.1186/s12911-025-03013-8.

## Linked entities

- **Genes:** APP (amyloid beta precursor protein) [NCBI Gene 351], APOE (apolipoprotein E) [NCBI Gene 348], HRAS (HRas proto-oncogene, GTPase) [NCBI Gene 3265], PTEN (phosphatase and tensin homolog) [NCBI Gene 5728]
- **Diseases:** Alzheimer’s disease (MONDO:0004975), melanoma (MONDO:0005105)
- **Species:** Mus musculus (taxon 10090)

## Full-text entities

- **Genes:** Pten (phosphatase and tensin homolog) [NCBI Gene 19211] {aka 2310035O07Rik, A130070J02Rik, B430203M17Rik, MMAC1, PTENbeta, TEP1}, Hras (Hras proto-oncogene, GTPase) [NCBI Gene 15461] {aka H-ras, Ha-ras, Harvey-ras, Hras-1, Hras1, Kras2}
- **Diseases:** melanoma (MESH:D008545), Alzheimer's disease (MESH:D000544)
- **Species:** Homo sapiens (human, species) [taxon 9606], Mus musculus (house mouse, species) [taxon 10090]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12082848/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12082848/full.md

## References

35 references — full list in the complete paper: https://tomesphere.com/paper/PMC12082848/full.md

---
Source: https://tomesphere.com/paper/PMC12082848