# Integrative Bioinformatics and Machine Learning Identify Novel Diagnostic Biomarkers and Molecular Mechanisms in Sjögren’s Syndrome

**Authors:** Hua Xu, Yong Liu, Yuyin Song, Yifan Zheng, Haifeng Jing, Yanfei Gao, Depeng Zhou, Xiang Chi, Jia Chen

PMC · DOI: 10.1155/ijog/5044551 · 2026-01-16

## TL;DR

This study uses bioinformatics and machine learning to find new biomarkers and understand the molecular basis of Sjögren’s Syndrome, a difficult-to-diagnose autoimmune disease.

## Contribution

The study proposes 12 candidate diagnostic biomarkers for Sjögren’s Syndrome and characterizes their molecular mechanisms and immune-related functions.

## Key findings

- Twelve hub genes (e.g., EPSTI1, IFIH1) were identified, with AUCs of 0.994 in training, 0.838 in internal validation, and 0.825 in external validation.
- CIBERSORT immune infiltration analysis showed immune dysregulation in SS patients, including an imbalance between naive and memory B cells and reduced CD8+ T cells and Tregs.
- Drug repurposing analysis on the L1000FWD platform suggested FDA-approved drugs such as nisoldipine and exemestane as potential treatments.

## Abstract

Sjögren’s syndrome (SS) is a chronic autoimmune disorder characterized by significant diagnostic challenges due to nonspecific symptoms and a lack of reliable biomarkers, often resulting in delayed diagnosis and suboptimal patient management.

This study aimed to identify novel diagnostic biomarkers and elucidate the molecular mechanisms underlying SS pathogenesis through integrative bioinformatics and machine learning approaches.

We analyzed three peripheral blood transcriptomic datasets (GSE51092, GSE66795, and GSE84844) comprising a total of 351 SS patients and 91 healthy controls. Differential expression analysis, weighted gene coexpression network analysis (WGCNA), and 12 machine learning algorithms were employed to identify robust diagnostic biomarkers. Immune cell infiltration was assessed using CIBERSORT, and single‐cell RNA sequencing data (GSE157278) were analyzed to validate cell‐type‐specific expression patterns. Drug repurposing analysis was conducted using the L1000FWD platform.
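To make the differential expression step concrete, here is a minimal sketch in Python. The abstract does not name the statistical tool the authors used, so this substitutes a plain Welch's t-statistic and a log2 fold change as a stand-in; it is not the authors' pipeline. The gene names `GENE_A` and `GENE_B`, the helper `rank_genes`, and all expression values are hypothetical and synthetic.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(cases, controls):
    """Welch's t-statistic for two independent samples with unequal variances."""
    se = sqrt(variance(cases) / len(cases) + variance(controls) / len(controls))
    return (mean(cases) - mean(controls)) / se


def rank_genes(expr, labels):
    """Rank genes by |t|.

    expr:   gene name -> list of log2 expression values, one per sample
    labels: 1 for an SS patient, 0 for a healthy control (same sample order)
    Returns a list of (gene, log2_fold_change, t_statistic) tuples.
    """
    rows = []
    for gene, values in expr.items():
        cases = [v for v, y in zip(values, labels) if y == 1]
        controls = [v for v, y in zip(values, labels) if y == 0]
        # On log2-scale data, the fold change is a difference of group means.
        log2_fc = mean(cases) - mean(controls)
        rows.append((gene, log2_fc, welch_t(cases, controls)))
    return sorted(rows, key=lambda r: abs(r[2]), reverse=True)


# Synthetic toy data (assumption): 4 patients followed by 4 controls.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
expr = {
    "GENE_A": [8.1, 8.4, 7.9, 8.6, 6.0, 6.2, 5.9, 6.1],  # shifted upward in cases
    "GENE_B": [5.0, 5.2, 4.9, 5.1, 5.1, 5.0, 5.2, 4.9],  # no group difference
}
ranked = rank_genes(expr, labels)
for gene, log2_fc, t in ranked:
    print(f"{gene}: log2FC={log2_fc:.2f}, t={t:.2f}")
```

A real analysis would also correct for multiple testing across thousands of genes and handle batch effects when combining datasets; both are omitted here for brevity.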

We identified 12 hub genes (EPSTI1, IFIH1, CXCL10, TNFSF10, GBP5, PARP9, IFI44, LAP3, IFIT2, IFI44L, PARP12, and OAS1) with exceptional diagnostic performance (AUC = 0.994 in training, 0.838 in internal validation, and 0.825 in external validation). These biomarkers showed significant correlations with clinical indicators including ANA, Ro/SSA, and La/SSB (p < 0.05). Immune‐infiltration analysis revealed pronounced immune dysregulation in SS patients, characterized by an imbalance between naive and memory B cells and reduced CD8+ T cells and regulatory T cells (Tregs). Single‐cell transcriptomics confirmed predominant expression in monocytes and dendritic cells, with additional significant expression in B cells and CD4+ T cells. Virtual knockdown analysis implicated these genes in antigen presentation, interferon signaling, and leukocyte trafficking. Drug repurposing identified FDA‐approved candidates such as nisoldipine and exemestane as potential therapeutics.
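The AUC values reported above can be read as the probability that a randomly chosen SS patient receives a higher model score than a randomly chosen control. The following sketch computes AUC directly from that pairwise definition. The function name `roc_auc` and the scores are illustrative assumptions on synthetic data and do not reproduce the paper's results.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (case, control) pairs ranked correctly.

    scores: model output per sample (higher means more likely SS)
    labels: 1 for an SS patient, 0 for a healthy control
    Tied pairs count as half a correct ranking.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Synthetic scores (assumption): one patient scores below one control.
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
auc = roc_auc(scores, labels)
print(auc)
```

Because this definition depends only on the ranking of scores, it is unaffected by the unequal group sizes in the study (351 patients versus 91 controls). The pairwise loop is quadratic in sample count, which is fine for cohorts of this size.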

Our integrative approach identifies 12 robust diagnostic biomarkers for SS, offering new insights into disease mechanisms and highlighting potential therapeutic targets for this challenging autoimmune disorder.

## Linked entities

- **Genes:** EPSTI1 (epithelial stromal interaction 1) [NCBI Gene 94240], IFIH1 (interferon induced with helicase C domain 1) [NCBI Gene 64135], CXCL10 (C-X-C motif chemokine ligand 10) [NCBI Gene 3627], TNFSF10 (TNF superfamily member 10) [NCBI Gene 8743], GBP5 (guanylate binding protein 5) [NCBI Gene 115362], PARP9 (poly(ADP-ribose) polymerase family member 9) [NCBI Gene 83666], IFI44 (interferon induced protein 44) [NCBI Gene 10561], LAP3 (leucine aminopeptidase 3) [NCBI Gene 51056], IFIT2 (interferon induced protein with tetratricopeptide repeats 2) [NCBI Gene 3433], IFI44L (interferon induced protein 44 like) [NCBI Gene 10964], PARP12 (poly(ADP-ribose) polymerase family member 12) [NCBI Gene 64761], OAS1 (2'-5'-oligoadenylate synthetase 1) [NCBI Gene 4938]
- **Chemicals:** nisoldipine (PubChem CID 4499), exemestane (PubChem CID 60198)

## Full-text entities

- **Genes:** CXCL10 (C-X-C motif chemokine ligand 10) [NCBI Gene 3627] {aka C7, IFI10, INP10, IP-10, SCYB10, crg-2}, CD4 (CD4 molecule) [NCBI Gene 920] {aka CD4mut, IMD79, Leu-3, OKT4D, T4}, CD8A (CD8 subunit alpha) [NCBI Gene 925] {aka CD8, CD8alpha, IMD116, Leu2, p32}, IFI44 (interferon induced protein 44) [NCBI Gene 10561] {aka MTAP44, TLDC5, p44}, IFI44L (interferon induced protein 44 like) [NCBI Gene 10964] {aka C1orf29, GS3686, TLDC5B}, EPSTI1 (epithelial stromal interaction 1) [NCBI Gene 94240] {aka BRESI1}, PARP9 (poly(ADP-ribose) polymerase family member 9) [NCBI Gene 83666] {aka ARTD9, BAL, BAL1, MGC:7868}, IFIH1 (interferon induced with helicase C domain 1) [NCBI Gene 64135] {aka AGS7, Hlcd, IDDM19, IMD95, MDA-5, MDA5}, IFIT2 (interferon induced protein with tetratricopeptide repeats 2) [NCBI Gene 3433] {aka G10P2, GARG-39, IFI-54, IFI-54K, IFI54, IFIT-2}, LAP3 (leucine aminopeptidase 3) [NCBI Gene 51056] {aka HEL-S-106, LAP, LAPEP, PEPS}, GBP5 (guanylate binding protein 5) [NCBI Gene 115362] {aka GBP-5}, PARP12 (poly(ADP-ribose) polymerase family member 12) [NCBI Gene 64761] {aka ARTD12, MST109, MSTP109, ZC3H1, ZC3HDC1}, SSB (small RNA binding exonuclease protection factor La) [NCBI Gene 6741] {aka LARP3, La, La/SSB, SSB/La}, TNFSF10 (TNF superfamily member 10) [NCBI Gene 8743] {aka APO2L, Apo-2L, CD253, TANCR, TL2, TNLG6A}, OAS1 (2'-5'-oligoadenylate synthetase 1) [NCBI Gene 4938] {aka E18/E16, IFI-4, IMD100, OIAS, OIASI}
- **Diseases:** SS (MESH:D012859), autoimmune disorder (MESH:D001327), immune dysregulation (OMIM:614878)
- **Chemicals:** nisoldipine (MESH:D015737), exemestane (MESH:C056516)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

28 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12811409/full.md

---
Source: https://tomesphere.com/paper/PMC12811409