# Machine learning and single-cell RNA sequencing analyses identify MS-related monocytes and a five-gene candidate biomarker signature

**Authors:** Di Pan, Xinyi Wei, Xiyan Kuang, Dan Yang

PMC · DOI: 10.3389/fneur.2026.1739231 · Frontiers in Neurology · 2026-02-11

## TL;DR

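This study applies machine learning to single-cell RNA sequencing data from experimental autoimmune encephalomyelitis (EAE), an animal model of multiple sclerosis (MS). It identifies five genes (COX5A, CTSS, GBP2, IRF7, and PGAM1) as candidate biomarkers of MS-related monocytes that may aid diagnosis and risk stratification.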

## Contribution

The study proposes a five-gene candidate biomarker signature for MS-related monocytes, derived from EAE scRNA-seq data, assessed with seven machine learning algorithms, and validated by qRT-PCR and immunofluorescence.

## Key findings

- Seven machine learning algorithms identified COX5A, CTSS, GBP2, IRF7, and PGAM1 as key biomarkers for MS-related monocytes.
- qRT-PCR and immunofluorescence validated the expression of these five genes.
- The gene signature may improve MS diagnosis and risk stratification and provide new insights into monocyte-driven immunopathology in MS.
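qRT-PCR validation of this kind is often summarized as a fold change using the 2^(-ΔΔCt) method. This summary does not state how the authors quantified expression, so the sketch below is an assumption rather than their protocol. The function name `fold_change`, the Ct values, and the choice of a single reference gene are hypothetical, and the formula assumes roughly 100% amplification efficiency.

```python
def fold_change(ct_target_case, ct_ref_case, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression of a target gene by the 2^(-ddCt) method.

    Each argument is a mean cycle-threshold (Ct) value. "case" and "ctrl"
    are the two groups being compared; "ref" is a reference (housekeeping)
    gene measured in the same samples.
    """
    # Normalize the target gene to the reference gene within each group.
    delta_case = ct_target_case - ct_ref_case
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl
    # Compare the normalized values between groups.
    delta_delta = delta_case - delta_ctrl
    return 2 ** (-delta_delta)


# Hypothetical Ct values, chosen only to make the arithmetic easy to follow.
example = fold_change(
    ct_target_case=24.0, ct_ref_case=18.0,
    ct_target_ctrl=26.0, ct_ref_ctrl=18.0,
)
print(example)
```

A lower Ct means more starting template, so a negative ΔΔCt corresponds to a fold change above 1.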

## Abstract

Multiple sclerosis (MS) is a chronic autoimmune inflammatory disease of the central nervous system (CNS). Based on single-cell RNA sequencing (scRNA-seq) data from experimental autoimmune encephalomyelitis (EAE), this study applied machine learning algorithms combined with integrative bioinformatics methods to identify pivotal biomarkers associated with MS-related monocytes.

Machine learning and scRNA-seq analyses were performed to characterize MS-related monocytes, leading to the identification of five candidate biomarkers associated with pathogenic alterations. The performance of seven algorithms, namely logistic regression (LogReg), linear discriminant analysis (LDA), support vector machine (SVM), Naive Bayes (NB), k-nearest neighbor (KNN), recursive partitioning decision trees (Rpart), and random forest (RF), was evaluated. In addition, the CIBERSORT, single-sample gene set enrichment analysis (ssGSEA), and GSEA algorithms were employed to investigate and define immunological features and biological functions. Finally, quantitative real-time polymerase chain reaction (qRT-PCR) and immunofluorescence were used to validate the expression of the identified genes.
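The idea behind a single-sample enrichment score such as ssGSEA can be sketched in a few lines. This is a simplified, unnormalized version written for illustration, not the authors' pipeline or any package's exact implementation. The function name `enrichment_score`, the expression values, and the exponent `alpha = 0.25` are assumptions.

```python
def enrichment_score(expression, gene_set, alpha=0.25):
    """Simplified single-sample enrichment score for one sample.

    expression: dict mapping gene symbol -> expression value
    gene_set:   set of gene symbols to score
    alpha:      exponent applied to the ranks of genes in the set
    """
    # Order genes from highest to lowest expression within the sample.
    ordered = sorted(expression, key=expression.get, reverse=True)
    n = len(ordered)
    hits = [gene in gene_set for gene in ordered]
    n_hits = sum(hits)
    if n_hits == 0 or n_hits == n:
        raise ValueError("gene set must cover some, but not all, measured genes")

    # The highest-expressed gene gets rank n, the lowest gets rank 1.
    weights = [(n - i) ** alpha if hit else 0.0 for i, hit in enumerate(hits)]
    total_weight = sum(weights)
    miss_step = 1.0 / (n - n_hits)

    # Walk down the ranked list and accumulate the gap between the
    # weighted distribution of set genes and that of all other genes.
    hit_cdf = miss_cdf = score = 0.0
    for weight, hit in zip(weights, hits):
        if hit:
            hit_cdf += weight / total_weight
        else:
            miss_cdf += miss_step
        score += hit_cdf - miss_cdf
    return score


# Hypothetical expression values for one sample (not data from the paper).
SIGNATURE = {"COX5A", "CTSS", "GBP2", "IRF7", "PGAM1"}
sample = {
    "COX5A": 8.1, "CTSS": 9.4, "GBP2": 7.7, "IRF7": 7.2, "PGAM1": 8.8,
    "RPL11": 6.0, "RAN": 5.1, "YWHAE": 4.3, "PSMA4": 3.9, "MOG": 0.5,
}
print(enrichment_score(sample, SIGNATURE))
```

A positive score means the gene set sits toward the top of the sample's expression ranking, and a negative score means it sits toward the bottom.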

Seven machine learning algorithms consistently validated five key genes (COX5A, CTSS, GBP2, IRF7, and PGAM1) as optimally characterized biomarkers. The infiltration profiles of these genes, together with associated immune cell types, provide potential biological underpinnings for the pathogenic alterations observed in MS.
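One of the seven algorithms, k-nearest neighbor, is simple enough to illustrate how a five-gene signature can be scored as a classifier. The sketch below runs leave-one-out cross-validation on a small synthetic matrix. The values are invented so that the two groups separate cleanly, and they say nothing about the direction or size of expression changes reported in the paper. The function names and the choice of `k = 3` are likewise assumptions.

```python
import math
from collections import Counter

GENES = ["COX5A", "CTSS", "GBP2", "IRF7", "PGAM1"]

# Synthetic expression matrix: one row per sample, one column per gene.
X = [
    [1.0, 1.2, 0.9, 1.1, 1.0],
    [1.1, 1.0, 1.0, 0.9, 1.2],
    [0.9, 1.1, 1.1, 1.0, 0.9],
    [1.2, 0.9, 1.0, 1.2, 1.1],
    [2.0, 2.3, 2.1, 2.4, 1.9],
    [2.2, 2.1, 2.4, 2.0, 2.1],
    [1.9, 2.4, 2.0, 2.2, 2.3],
    [2.1, 2.0, 2.2, 2.1, 2.0],
]
y = ["control"] * 4 + ["EAE"] * 4


def knn_predict(train_x, train_y, query, k=3):
    """Predict a label by majority vote among the k nearest training samples."""
    distances = sorted(
        (math.dist(row, query), label) for row, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def loocv_accuracy(xs, ys, k=3):
    """Leave-one-out cross-validation: hold out each sample once and predict it."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if knn_predict(train_x, train_y, xs[i], k) == ys[i]:
            correct += 1
    return correct / len(xs)


print(loocv_accuracy(X, y, k=3))
```

Holding each sample out before predicting it keeps the accuracy estimate from being inflated by the sample's own label, which matters when comparing several algorithms on a small dataset.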

Collectively, these findings indicate that COX5A, CTSS, GBP2, IRF7, and PGAM1 represent promising biomarkers for MS. The identified gene signature may improve MS diagnosis and risk stratification and provide new insights into monocyte-driven immunopathology.

## Linked entities

- **Genes:** COX5A (cytochrome c oxidase subunit 5A) [NCBI Gene 9377], CTSS (cathepsin S) [NCBI Gene 1520], GBP2 (guanylate binding protein 2) [NCBI Gene 2634], IRF7 (interferon regulatory factor 7) [NCBI Gene 3665], PGAM1 (phosphoglycerate mutase 1) [NCBI Gene 5223]
- **Diseases:** Multiple sclerosis (MONDO:0005301), experimental autoimmune encephalomyelitis (MONDO:0005134)

## Full-text entities

- **Genes:** Nampt (nicotinamide phosphoribosyltransferase) [NCBI Gene 59027] {aka 1110035O14Rik, NAmPRTase, Pbef, Pbef1, Visfatin}, Ccl2 (C-C motif chemokine ligand 2) [NCBI Gene 20296] {aka HC11, JE, MCAF, MCP-1, MCP1, SMC-CF}, LDHA (lactate dehydrogenase A) [NCBI Gene 3939] {aka GSD11, HEL-S-133P, LDHM, PIG19}, COX5A (cytochrome c oxidase subunit 5A) [NCBI Gene 9377] {aka COX, COX-VA, MC4DN20, VA}, PPA1 (inorganic pyrophosphatase 1) [NCBI Gene 5464] {aka HEL-S-66p, IOPPP, PP, PP1, SID6-8061}, Il1b (interleukin 1 beta) [NCBI Gene 16176] {aka IL-1beta, Il-1b}, Jun (Jun proto-oncogene, AP-1 transcription factor subunit) [NCBI Gene 16476] {aka AP-1, Junc, c-jun}, Ccr5 (C-C motif chemokine receptor 5) [NCBI Gene 12774] {aka AM4-7, CD195, Cmkbr5}, Ccl5 (C-C motif chemokine ligand 5) [NCBI Gene 20304] {aka MuRantes, RANTES, SISd, Scya5, TCP228}, Tnf (tumor necrosis factor) [NCBI Gene 21926] {aka DIF, TNF-a, TNF-alpha, TNFSF2, TNFalpha, Tnfa}, Cebpb (CCAAT/enhancer binding protein beta) [NCBI Gene 12608] {aka C/EBPbeta, CRP2, IL-6DBP, LAP, LIP, NF-IL6}, Cox5a (cytochrome c oxidase subunit 5A) [NCBI Gene 12858] {aka CcOX}, CCR2 (C-C motif chemokine receptor 2) [NCBI Gene 729230] {aka CC-CKR-2, CCR-2, CCR2A, CCR2B, CD192, CKR2}, Nfkb1 (nuclear factor of kappa light polypeptide gene enhancer in B cells 1, p105) [NCBI Gene 18033] {aka NF-KB1, NF-kappaB, NF-kappaB1, p105, p50, p50/p105}, PGAM1 (phosphoglycerate mutase 1) [NCBI Gene 5223] {aka HEL-S-35, PGAM-B, PGAMA}, Tgfb1 (transforming growth factor, beta 1) [NCBI Gene 21803] {aka TGF-beta1, TGFbeta1, Tgfb, Tgfb-1}, IL6 (interleukin 6) [NCBI Gene 3569] {aka BSF-2, BSF2, CDF, HGF, HSF, IFN-beta-2}, Osm (oncostatin M) [NCBI Gene 18413] {aka OncoM}, Vegfa (vascular endothelial growth factor A) [NCBI Gene 22339] {aka L-VEGF, Vegf, Vpf}, Nlrp3 (NLR family, pyrin domain containing 3) [NCBI Gene 216799] {aka AGTAVPRL, AII/AVP, Cias1, FCAS, FCU, MWS}, IRF7 (interferon regulatory factor 7) [NCBI Gene 3665] {aka IMD39, IRF-7, 
IRF-7H, IRF7A, IRF7B, IRF7C}, Ctss (cathepsin S) [NCBI Gene 13040] {aka Cats}, FCGR3A (Fc gamma receptor IIIa) [NCBI Gene 2214] {aka CD16-II, CD16A, FCG3, FCGR3, FCRIIIA, FcGRIIIA}, GBP2 (guanylate binding protein 2) [NCBI Gene 2634], RAN (RAN, member RAS oncogene family) [NCBI Gene 5901] {aka ARA24, Gsp1, TC4}, YWHAE (tyrosine 3-monooxygenase/tryptophan 5-monooxygenase activation protein epsilon) [NCBI Gene 7531] {aka 14-3-3E, HEL2, KCIP-1, MDCR, MDS}, Ccr2 (C-C motif chemokine receptor 2) [NCBI Gene 12772] {aka Cc-ckr-2, Ccr2a, Ccr2b, Ckr2, Ckr2a, Ckr2b}, Irf7 (interferon regulatory factor 7) [NCBI Gene 54123], CTSS (cathepsin S) [NCBI Gene 1520], Anxa11os (annexin A11, opposite strand) [NCBI Gene 105245705] {aka Gm9872}, Stat1 (signal transducer and activator of transcription 1) [NCBI Gene 20846] {aka 2010005J02Rik}, CD14 (CD14 molecule) [NCBI Gene 929], ITGAM (integrin subunit alpha M) [NCBI Gene 3684] {aka CD11B, CR3A, HNA-4, MAC-1, MAC1A, MO1A}, Itpr3 (inositol 1,4,5-triphosphate receptor 3) [NCBI Gene 16440] {aka IP3R 3, IP3R-3, Ip3r3, Itpr-3, tf}, RPL11 (ribosomal protein L11) [NCBI Gene 6135] {aka DBA7, GIG34, L11, uL5}, F11r (F11 receptor) [NCBI Gene 16456] {aka 9130004G24, ESTM33, JAM, JAM-1, JAM-A, Jcam}, MIR155 (microRNA 155) [NCBI Gene 406947] {aka MIRN155, miRNA155, mir-155}, Ifng (interferon gamma) [NCBI Gene 15978] {aka IFN-g, If2f, Ifg}, Irf5 (interferon regulatory factor 5) [NCBI Gene 27056] {aka mirf5}, Ccr1 (C-C motif chemokine receptor 1) [NCBI Gene 12768] {aka Cmkbr1, Mip-1a-R}, Gbp2 (guanylate binding protein 2) [NCBI Gene 14469], CD80 (CD80 molecule) [NCBI Gene 941] {aka B7, B7-1, B7.1, BB1, CD28LG, CD28LG1}, PSMA4 (proteasome 20S subunit alpha 4) [NCBI Gene 5685] {aka HC9, HsT17706, PSC9}, Stat3 (signal transducer and activator of transcription 3) [NCBI Gene 20848] {aka 1110034C02Rik, Aprf}, CD4 (CD4 molecule) [NCBI Gene 920] {aka CD4mut, IMD79, Leu-3, OKT4D, T4}, Actb (actin, beta) [NCBI Gene 11461] {aka Actx, E430023M04Rik, beta-actin}, 
Junb (jun B proto-oncogene, AP-1 transcription factor subunit) [NCBI Gene 16477], GPNMB (glycoprotein nmb) [NCBI Gene 10457] {aka HGFIN, NMB, PLCA3}, Stat5a (signal transducer and activator of transcription 5A) [NCBI Gene 20850] {aka STAT5}, IL12B (interleukin 12B) [NCBI Gene 3593] {aka CLMF, CLMF2, IL-12B, IMD28, IMD29, NKSF}, Ifna (interferon alpha complex region) [NCBI Gene 111654] {aka Ifa, Ifa8}, Mef2a (myocyte enhancer factor 2A) [NCBI Gene 17258] {aka A430079H05Rik}, MOG (myelin oligodendrocyte glycoprotein) [NCBI Gene 4340] {aka BTN6, BTNL11, MOGIG2, NRCLP7}, Trp53-ps (transformation related protein 53, pseudogene) [NCBI Gene 22060], Pgam1 (phosphoglycerate mutase 1) [NCBI Gene 18648] {aka 2310050F24Rik, Pgam-1}, Il6 (interleukin 6) [NCBI Gene 16193] {aka Il-6}, Myc (Myc proto-oncogene, bHLH transcription factor) [NCBI Gene 17869] {aka Myc2, Niard, Nird, bHLHe39}, Cd4 (CD4 antigen) [NCBI Gene 12504] {aka L3T4, Ly-4}, Ly6c1 (lymphocyte antigen 6 family member C1) [NCBI Gene 17067] {aka Ly-6C, Ly-6C1, Ly6c}, Mif (macrophage migration inhibitory factor (glycosylation-inhibiting factor)) [NCBI Gene 17319] {aka DER6, GIF, Glif}
- **Diseases:** brain disease (MESH:D001927), transverse myelitis (MESH:D009188), physical disability (MESH:D059445), Demyelinating lesions (MESH:D003711), death (MESH:D003643), diplopia (MESH:D004172), immunodeficiency (MESH:D007153), Disease (MESH:D004194), neurodegeneration (MESH:D019636), multifocal leukoencephalopathy (MESH:D007968), cerebellar ataxia (MESH:D002524), inflammation (MESH:D007249), legionellosis (MESH:D007876), BBB damage (MESH:C536830), CNS (MESH:D002493), graft-versus-host disease (MESH:D006086), neuroinflammation (MESH:D000090862), brainstem dysfunction (MESH:D020295), inflammatory damage (MESH:D018746), cardiovascular disease (MESH:D002318), sensory deficits or impairments (MESH:D012678), weakness (MESH:D018908), leishmaniasis (MESH:D007896), musculoskeletal system cancers (MESH:D009369), Paralysis (MESH:D010243), type I diabetes mellitus (MESH:D003922), immune-mediated diseases (MESH:C567355), autoimmune inflammatory disease (MESH:D001327), neurological disease (MESH:D020271), EAE (MESH:D004681), systemic lupus erythematosus (MESH:D008180), axonal damage (MESH:D001480), immune dysregulation (OMIM:614878), neurological symptoms (MESH:D009461), MS (MESH:D009103)
- **Chemicals:** glycosaminoglycan (MESH:D006025), Alexa Fluor 594 (-), heparan sulfate (MESH:D006497), paraffin (MESH:D010232), linoleic acid (MESH:D019787), chondroitin sulfate (MESH:D002809), nicotine (MESH:D009538), Alexa Fluor 488 (MESH:C000711379), fatty acid (MESH:D005227), paraformaldehyde (MESH:C003043), isoflurane (MESH:D007530), TRIzol (MESH:C411644), endocannabinoid (MESH:D063388), natalizumab (MESH:D000069442), water (MESH:D014867), ATP (MESH:D000255), alpha-linolenic acid (MESH:D017962), sc (MESH:D012538), DAPI (MESH:C007293), short-chain fatty acids (MESH:D005232), calcium (MESH:D002118), dermatan sulfate (MESH:D003871), PBS (MESH:D007854), heparin (MESH:D006493)
- **Species:** Homo sapiens (human, species) [taxon 9606], Rattus norvegicus (brown rat, species) [taxon 10116], Mus musculus (house mouse, species) [taxon 10090]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12932200/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12932200/full.md

## References

42 references, with the full list in the complete paper: https://tomesphere.com/paper/PMC12932200/full.md

---
Source: https://tomesphere.com/paper/PMC12932200