# GA4GH phenopacket-driven characterization of genotype-phenotype correlations in Mendelian disorders

**Authors:** Lauren Rekerle, Daniel Danis, Filip Rehburg, Adam S.L. Graefe, Viktor Bily, Andrés Caballero-Oteyza, Pilar Cacheiro, Leonardo Chimirri, Jessica X. Chong, Evan Connelly, Bert B.A. de Vries, Alexander J.M. Dingemans, Michael H. Duyzend, Tomas Freiberger, Petra Gehle, Tudor Groza, Peter Hansen, Julius O.B. Jacobsen, Adam Klocperk, Markus S. Ladewig, Michael I. Love, Allison J. Marcello, Alexander Mordhorst, Monica C. Munoz-Torres, Justin Reese, Catharina Schuetz, Damian Smedley, Timmy Strauss, Ondrej Vladyka, David Zocche, Sylvia Thun, Christopher J. Mungall, Melissa A. Haendel, Peter N. Robinson

PMC · DOI: 10.1016/j.ajhg.2025.12.001 · 2025-12-23

## TL;DR

GPSEA is a software package that uses GA4GH phenopackets, a standardized representation of case-level clinical and genetic data, to test for genotype-phenotype correlations in Mendelian diseases.

## Contribution

GPSEA provides a standardized approach, built on the GA4GH Phenopacket Schema, for discovering genotype-phenotype correlations in Mendelian disease.
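As an illustration of the kind of standardized record the Phenopacket Schema provides, the sketch below parses a phenopacket-like JSON fragment and separates observed from excluded Human Phenotype Ontology terms. The field names (`phenotypicFeatures`, `type`, `excluded`) follow the schema's JSON representation; the record itself, its identifiers, and the helper `split_features` are hypothetical, and the fragment is deliberately incomplete (it omits metadata and genetic interpretations). This is not GPSEA's own loading code.

```python
import json

# Hypothetical, incomplete phenopacket-like record for illustration only.
PHENOPACKET_JSON = """
{
  "id": "example-individual-1",
  "subject": {"id": "individual-1", "sex": "FEMALE"},
  "phenotypicFeatures": [
    {"type": {"id": "HP:0001250", "label": "Seizure"}},
    {"type": {"id": "HP:0001631", "label": "Atrial septal defect"},
     "excluded": true}
  ]
}
"""


def split_features(packet):
    """Return (observed, excluded) lists of HPO term ids.

    A phenotypic feature counts as excluded when its `excluded` flag is
    true; a missing flag is treated as false, i.e. the feature was observed.
    """
    observed, excluded = [], []
    for feature in packet.get("phenotypicFeatures", []):
        term_id = feature["type"]["id"]
        if feature.get("excluded", False):
            excluded.append(term_id)
        else:
            observed.append(term_id)
    return observed, excluded


packet = json.loads(PHENOPACKET_JSON)
observed, excluded = split_features(packet)
print("observed:", observed)
print("excluded:", excluded)
```

Recording explicitly excluded features matters for association testing, because an individual in whom a feature was ruled out is different from one in whom it was never assessed.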

## Key findings

- GPSEA identified 253 significant genotype-phenotype correlations across 85 cohorts comprising 6,179 previously published individuals.
- 48 of the 85 cohorts showed at least one statistically significant genotype-phenotype correlation.
- Standardized data representations enable scalable discovery of genotype-phenotype correlations.

## Abstract

Comprehensively characterizing genotype-phenotype correlations (GPCs) in Mendelian disease would create new opportunities for improving clinical management and understanding disease biology. However, heterogeneous approaches to data sharing, reuse, and analysis have hindered progress in the field. We developed Genotype-Phenotype Statistical Evaluation of Associations (GPSEA), a software package that leverages the Global Alliance for Genomics and Health (GA4GH) Phenopacket Schema to represent case-level clinical and genetic data about individuals. GPSEA applies an independent filtering strategy to boost statistical power to detect categorical GPCs represented by Human Phenotype Ontology terms. GPSEA additionally enables visualization and analysis of continuous phenotypes, clinical severity scores, and survival data such as age of onset of disease or clinical manifestations. We applied GPSEA to 85 cohorts with 6,179 previously published individuals with variants in one of 81 genes associated with 122 Mendelian diseases and identified 253 significant GPCs, with 48 cohorts having at least one statistically significant GPC. These results highlight the power of standardized representations of clinical data for scalable discovery of GPCs in Mendelian disease.
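The general idea behind independent filtering can be sketched as follows: phenotype terms are screened with a criterion that does not depend on the genotype comparison, so fewer hypotheses enter the multiple-testing correction. The sketch below is a generic illustration under stated assumptions, not GPSEA's implementation: the filter (a minimum overall term frequency), the choice of Fisher's exact test with Benjamini-Hochberg adjustment, the function names, and all counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as observed.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def analyze(counts, min_freq=0.2):
    """Filter terms by overall frequency, then test and adjust.

    counts maps a term to (with_A, without_A, with_B, without_B), the number
    of individuals in genotype groups A and B with or without the term.
    The filter uses only the pooled frequency, not the group difference.
    """
    kept = {t: c for t, c in counts.items()
            if (c[0] + c[2]) / sum(c) >= min_freq}
    terms = list(kept)
    pvals = [fisher_exact_two_sided(*kept[t]) for t in terms]
    return dict(zip(terms, zip(pvals, benjamini_hochberg(pvals))))


# Hypothetical counts for three placeholder terms.
counts = {
    "term_1": (9, 1, 2, 8),
    "term_2": (5, 5, 5, 5),
    "term_3": (1, 9, 0, 10),  # too rare overall; removed before testing
}
results = analyze(counts)
for term, (p, p_adj) in results.items():
    print(term, round(p, 4), round(p_adj, 4))
```

Here the rare placeholder term is dropped before any test is run, so the correction is applied over two hypotheses instead of three, which lowers the adjusted p-value of the remaining association.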

GPSEA is a software tool that uses the GA4GH Phenopacket Schema to streamline discovery of genotype-phenotype correlations (GPCs) in Mendelian diseases. Applied to 85 cohorts of previously published individuals, it identified 253 significant GPCs, highlighting the value of standardized clinical data for scalable GPC discovery.

## Full-text entities

- **Diseases:** Mendelian disease (MESH:D030342), Mendelian disorders (MESH:D025861)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12824607/full.md

---
Source: https://tomesphere.com/paper/PMC12824607