# Integrating machine learning and immune infiltration analysis to identify core genes and construct a diagnostic model for type 2 diabetes mellitus

**Authors:** Fangqin Cui, Li Li, Mingji Hu, Bao Li, Bang Du, Qingqing Fang, Dake Huang, Xiaonan Zhang

PMC · DOI: 10.3389/fendo.2026.1790356 · Frontiers in Endocrinology · 2026-03-10

## TL;DR

This study combines machine learning with immune infiltration analysis to identify hub genes and build a diagnostic model for type 2 diabetes mellitus (T2DM), highlighting the immune-related genes BLVRB and NCF1 and a candidate drug, methylene blue.

## Contribution

The study applies a hybrid machine learning model (LASSO+GBM) together with immune infiltration analysis to identify T2DM-related genes, and builds a diagnostic model with potential therapeutic implications.

## Key findings

- 393 differentially expressed genes were identified, mainly linked to immune functions and pathways.
- A LASSO+GBM model identified six hub genes, with BLVRB and NCF1 showing significant dysregulation and immune correlations.
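- A logistic regression model built on these genes achieved AUC > 0.75 in both training and testing sets; hsa-miR-127-5p was predicted as a potential upstream regulator of BLVRB, and methylene blue as a potential drug targeting it.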
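
To make the AUC figure in the last finding concrete, here is a minimal sketch of how a logistic model's scores translate into an AUC. The two-gene weights, bias, and sample values are invented for illustration and are not the paper's fitted coefficients or data; `predict_proba` and `roc_auc` are hypothetical helper names.

```python
import math

def predict_proba(features, weights, bias):
    """Logistic model: sigmoid of a weighted sum of expression values."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def roc_auc(labels, scores):
    """AUC as the fraction of (case, control) pairs in which the case
    scores higher than the control; tied pairs count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical two-gene model and made-up samples (1 = T2DM, 0 = control).
weights, bias = [1.2, -0.8], 0.1
samples = [[2.0, 0.5], [1.5, 1.0], [0.2, 1.8], [0.1, 1.2], [1.0, 2.5]]
labels = [1, 1, 0, 1, 0]

scores = [predict_proba(x, weights, bias) for x in samples]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a value above 0.75 means that in more than three of four case-control pairs the model ranks the case higher.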

## Abstract

Type 2 diabetes mellitus (T2DM) is a prevalent metabolic disorder, and identifying robust biomarkers is crucial for improving diagnosis and understanding its pathogenesis.

We analyzed the gene expression dataset GSE250283 from the GEO database to identify differentially expressed genes (DEGs). Functional enrichment analyses (GO and KEGG) were performed. A comprehensive evaluation of 113 machine learning algorithm combinations was conducted to select an optimal model for hub gene identification and diagnostic prediction. The expression of key genes was validated using independent datasets and quantitative real-time PCR (qRT-PCR). Immune infiltration analysis, gene regulatory network prediction, and drug interaction analysis were also carried out.
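
As a rough illustration of the DEG screening step, the sketch below filters genes by log2 fold change and Welch's t statistic. This summary does not state which tool or thresholds the authors used, so the statistic, the cutoffs, the function names, and the toy expression values are all assumptions made for the example.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances).
    Assumes each group has at least two values and is not constant."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

def screen_degs(expr, case_idx, ctrl_idx, min_abs_lfc=1.0, min_abs_t=2.0):
    """Return {gene: (log2 fold change, t)} for genes passing both cutoffs.

    expr maps gene -> list of log2-scale expression values, one per sample.
    The cutoffs are illustrative defaults, not the paper's thresholds.
    """
    hits = {}
    for gene, values in expr.items():
        case = [values[i] for i in case_idx]
        ctrl = [values[i] for i in ctrl_idx]
        lfc = mean(case) - mean(ctrl)  # difference of log2 means
        t = welch_t(case, ctrl)
        if abs(lfc) >= min_abs_lfc and abs(t) >= min_abs_t:
            hits[gene] = (lfc, t)
    return hits

# Toy data: three T2DM samples followed by three controls (made-up values).
toy_expr = {
    "GENE_A": [8.1, 8.4, 8.2, 6.0, 6.3, 6.1],  # clearly higher in cases
    "GENE_B": [5.0, 5.2, 4.9, 5.1, 5.0, 5.1],  # essentially unchanged
}
degs = screen_degs(toy_expr, case_idx=[0, 1, 2], ctrl_idx=[3, 4, 5])
print(sorted(degs))
```

A real analysis would also convert the statistic to a p-value and correct for multiple testing across all genes, which this sketch leaves out.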

A total of 393 DEGs were identified, primarily enriched in immune-related functions and pathways. The LASSO+GBM hybrid model performed best among the tested algorithm combinations and pinpointed six hub genes: LY96, CCR1, BLVRB, TCF3, LILRA2, and NCF1. A logistic regression model based on these genes showed promising predictive accuracy (AUC > 0.75) in both training and testing sets. Validation confirmed that BLVRB and NCF1 were significantly dysregulated. Immune infiltration analysis revealed significant alterations in the immune cell landscape of T2DM patients, with BLVRB and NCF1 showing substantial correlations with various immune cells. Regulatory network analysis suggested hsa-miR-127-5p as a potential upstream regulator of BLVRB, and methylene blue was identified as a potential drug targeting it.
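
The gene-to-immune-cell correlations mentioned above can be pictured with a rank correlation. The sketch below computes Spearman's rho between one gene's expression and one immune cell fraction. The summary does not say which correlation measure or deconvolution method the authors used, so the choice of Spearman, the function names, and the values are assumptions.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks.
    Assumes neither input is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Made-up values for five samples: expression of one gene and the
# estimated fraction of one immune cell type in the same samples.
gene_expr = [5.1, 6.3, 5.8, 7.2, 6.9]
cell_fraction = [0.10, 0.12, 0.18, 0.25, 0.21]

rho = spearman(gene_expr, cell_fraction)
print(f"Spearman rho = {rho:.2f}")
```

Working on ranks makes the measure robust to the skewed distributions typical of estimated cell fractions, which is one reason rank correlations are a common choice for this kind of comparison.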

This study identifies novel immune-related candidate genes for T2DM, particularly BLVRB and NCF1. The constructed diagnostic model shows potential for further development, and the findings offer new insights into the immune mechanisms of T2DM and potential therapeutic avenues.

## Linked entities

- **Genes:** LY96 (lymphocyte antigen 96) [NCBI Gene 23643], CCR1 (C-C motif chemokine receptor 1) [NCBI Gene 1230], BLVRB (biliverdin reductase B) [NCBI Gene 645], TCF3 (transcription factor 3) [NCBI Gene 6929], LILRA2 (leukocyte immunoglobulin like receptor A2) [NCBI Gene 11027], NCF1 (neutrophil cytosolic factor 1) [NCBI Gene 653361]
- **Chemicals:** methylene blue (PubChem CID 4139)
- **Diseases:** Type 2 diabetes mellitus (T2DM; MONDO:0005148)

## Full-text entities

- **Genes:** CCR1 (C-C motif chemokine receptor 1) [NCBI Gene 1230] {aka CD191, CKR-1, CKR1, CMKBR1, HM145, MIP1aR}, LY96 (lymphocyte antigen 96) [NCBI Gene 23643] {aka ESOP-1, MD-2, MD2, ly-96}, TCF3 (transcription factor 3) [NCBI Gene 6929] {aka AGM8, AGM8A, AGM8B, E2A, E47, ITF1}, LILRA2 (leukocyte immunoglobulin like receptor A2) [NCBI Gene 11027] {aka CD85H, ILT1, LIR-7, LIR7}, BLVRB (biliverdin reductase B) [NCBI Gene 645] {aka BVRB, FLR, HEL-S-10, SDR43U1}, MIR1275 (microRNA 1275) [NCBI Gene 100302123] {aka MIRN1275, hsa-mir-1275, mir-1275}, NCF1 (neutrophil cytosolic factor 1) [NCBI Gene 653361] {aka CGD1, NCF-1, NCF-47K, NCF1A, NOXO2, SH3PXD1A}
- **Diseases:** metabolic disorder (MESH:D008659), T2DM (MESH:D003924)
- **Chemicals:** methylene blue (MESH:D008751)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13008658/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13008658/full.md

## References

26 references — full list in the complete paper: https://tomesphere.com/paper/PMC13008658/full.md

---
Source: https://tomesphere.com/paper/PMC13008658