# Transcriptomic analysis and machine learning modeling identifies novel biomarkers and genetic characteristics of hypertrophic cardiomyopathy

**Authors:** Feng Zhang, Chunrui Li, Lulu Zhang

PMC · DOI: 10.3389/fgene.2025.1596049 · 2025-06-17

## TL;DR

This study uses RNA sequencing and machine learning to find new genetic markers and immune patterns in hypertrophic cardiomyopathy, leading to a 12-gene diagnostic signature.

## Contribution

The study develops a novel 12-gene diagnostic signature for HCM by combining transcriptomic analysis with machine learning.

## Key findings

- Identified 271 differentially expressed genes enriched in key biological pathways.
- Discovered distinct immune cell infiltration patterns in HCM myocardial tissues.
- Developed a 12-gene diagnostic signature with strong predictive performance in multiple cohorts.

## Abstract

This study aimed to leverage bioinformatics approaches to identify novel biomarkers and characterize the molecular mechanisms underlying hypertrophic cardiomyopathy (HCM).

Two RNA-sequencing datasets (GSE230585 and GSE249925) were obtained from the Gene Expression Omnibus (GEO) repository. Computational analysis was performed to compare transcriptomic profiles between normal cardiac tissues from healthy donors and myocardial tissues from HCM patients. Functional annotation of differentially expressed genes (DEGs) was performed using Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses. Immune cell infiltration patterns were quantified via single-sample gene set enrichment analysis (ssGSEA). A predictive model for HCM was developed through systematic evaluation of 113 combinations of 12 machine-learning algorithms, employing 10-fold cross-validation on training datasets and external validation using an independent cohort (GSE180313).
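The cross-validation step can be illustrated with a minimal, standard-library-only sketch. It shows how one candidate model might be scored with stratified 10-fold cross-validation so that candidates can be compared on held-out accuracy. Everything here is an assumption for illustration: the nearest-centroid classifier is a toy stand-in for the 12 algorithms (which this summary does not list), the data are synthetic, and the names `stratified_kfold`, `cross_validate`, `fit_centroid`, and `predict_centroid` are hypothetical rather than taken from the paper.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs with similar class balance per fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for idx in by_class.values():
        rng.shuffle(idx)
        # Deal each class round-robin across the k folds.
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    for f in range(k):
        test = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, test


def fit_centroid(X, y):
    """Toy stand-in model: the mean feature vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}


def predict_centroid(model, row):
    """Assign the class whose centroid is nearest in squared Euclidean distance."""
    return min(model, key=lambda c: sum((a - b) ** 2 for a, b in zip(row, model[c])))


def cross_validate(fit, predict, X, y, k=10, seed=0):
    """Mean held-out accuracy of one candidate model over k folds."""
    accs = []
    for train, test in stratified_kfold(y, k, seed):
        if not test:  # skip folds left empty by very small classes
            continue
        model = fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(model, X[i]) == y[i] for i in test)
        accs.append(hits / len(test))
    return sum(accs) / len(accs)


# Synthetic expression-like data: 20 controls (label 0) and 20 cases (label 1),
# 12 features each. These values are made up, not drawn from any GEO dataset.
rng = random.Random(1)
X = [[rng.gauss(0.0, 1.0) for _ in range(12)] for _ in range(20)]
X += [[rng.gauss(3.0, 1.0) for _ in range(12)] for _ in range(20)]
y = [0] * 20 + [1] * 20

accuracy = cross_validate(fit_centroid, predict_centroid, X, y, k=10)
print(f"mean 10-fold accuracy: {accuracy:.3f}")
```

To compare several candidates, the same `cross_validate` call would be repeated with different `fit` and `predict` pairs and the same `seed`, so that every candidate sees identical folds. The external cohort would then be held back entirely until a final model is chosen.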

A total of 271 DEGs were identified, primarily enriched in multiple biological pathways. Immune infiltration analysis revealed distinct patterns of immune cell composition. Based on the top differentially expressed genes, a robust 12-gene diagnostic signature (COMP, SFRP4, RASD1, IL1RL1, S100A8, S100A9, ESM1, CA3, MYL1, VGLL2, MCEMP1, and MT1A) was constructed, demonstrating superior performance in both training and testing cohorts.
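The abstract reports performance in training and testing cohorts without naming the metric. Diagnostic signatures of this kind are often summarized by the area under the ROC curve (AUC), so the sketch below assumes that metric purely for illustration; it is not the authors' evaluation code. The function name `roc_auc` and the example scores are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random case outscores a random control.

    labels: 1 for case (e.g. HCM), 0 for control. Ties count as half a win.
    """
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up signature scores for four samples: two controls, then two cases.
example_scores = [0.1, 0.4, 0.35, 0.8]
example_labels = [0, 0, 1, 1]
print(roc_auc(example_scores, example_labels))  # 0.75
```

Computing the same quantity separately on the training folds and on the independent cohort is one way to check whether a signature's performance holds up outside the data used to build it.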

This study utilized bioinformatics approaches to analyze RNA-sequencing datasets, identifying DEGs and distinct immune infiltration patterns in HCM. These findings enabled the construction of a 12-gene diagnostic signature with robust predictive performance, thereby advancing our understanding of HCM’s molecular biomarkers and pathogenic mechanisms.

## Linked entities

- **Genes:** COMP (cartilage oligomeric matrix protein) [NCBI Gene 1311], SFRP4 (secreted frizzled related protein 4) [NCBI Gene 6424], RASD1 (ras related dexamethasone induced 1) [NCBI Gene 51655], IL1RL1 (interleukin 1 receptor like 1) [NCBI Gene 9173], S100A8 (S100 calcium binding protein A8) [NCBI Gene 6279], S100A9 (S100 calcium binding protein A9) [NCBI Gene 6280], ESM1 (endothelial cell specific molecule 1) [NCBI Gene 11082], CA3 (carbonic anhydrase 3) [NCBI Gene 761], MYL1 (myosin light chain 1) [NCBI Gene 4632], VGLL2 (vestigial like family member 2) [NCBI Gene 245806], MCEMP1 (mast cell expressed membrane protein 1) [NCBI Gene 199675], MT1A (metallothionein 1A) [NCBI Gene 4489]
- **Diseases:** hypertrophic cardiomyopathy (MONDO:0005045)

## Full-text entities

- **Genes:** IL1RL1 (interleukin 1 receptor like 1) [NCBI Gene 9173] {aka DER4, FIT-1, IL33R, ST2, ST2L, ST2V}, SFRP4 (secreted frizzled related protein 4) [NCBI Gene 6424] {aka FRP-4, FRPHE, FRZB-2, PYL, sFRP-4}, S100A9 (S100 calcium binding protein A9) [NCBI Gene 6280] {aka 60B8AG, CAGB, CFAG, CGLB, L1AG, LIAG}, MCEMP1 (mast cell expressed membrane protein 1) [NCBI Gene 199675] {aka C19orf59}, MYL1 (myosin light chain 1) [NCBI Gene 4632] {aka CMYO14, CMYP14, MLC-1, MLC1, MLC1/3, MLC1F}, CA3 (carbonic anhydrase 3) [NCBI Gene 761] {aka CAIII, Car3}, MT1A (metallothionein 1A) [NCBI Gene 4489] {aka MT-1A, MT-IA, MT1, MT1S, MTC}, VGLL2 (vestigial like family member 2) [NCBI Gene 245806] {aka VGL2, VITO1}, ESM1 (endothelial cell specific molecule 1) [NCBI Gene 11082] {aka endocan}, RASD1 (ras related dexamethasone induced 1) [NCBI Gene 51655] {aka AGS1, DEXRAS1, MGC:26290}, S100A8 (S100 calcium binding protein A8) [NCBI Gene 6279] {aka 60B8AG, CAGA, CFAG, CGLA, CP-10, L1Ag}
- **Diseases:** HCM (MESH:D002312)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

Five figures with captions are available in the complete paper: https://tomesphere.com/paper/PMC12209200/full.md

---
Source: https://tomesphere.com/paper/PMC12209200