# Integrating traditional omics and AI-driven approaches for discovery and validation of novel MicroRNA biomarkers and therapeutic targets in thyroid cancer

**Authors:** Yi Wan, Dan Xie, Min Zhang, Shiyu Yang, Zhantian Zhang, Xiaomin Fu, Meiling Wang, Yongfu Zhao

PMC · DOI: 10.3389/fphar.2025.1727032 · 2026-01-28

## TL;DR

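This study combines meta-analysis of bulk transcriptomics, machine learning-based feature selection, and single-cell validation to identify hsa-miR-6756-5p as a candidate biomarker and therapeutic target in thyroid cancer.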

## Contribution

The novel contribution is an integrated framework combining bulk transcriptomics, AI-driven biomarker selection, and single-cell validation to identify a new miRNA therapeutic target.

## Key findings

- A four-gene diagnostic panel (BID, MIR6756, ITM2A, TGM2) achieved AUC values of 1.0 and 0.99 in training sets and 0.74 in independent validation.
- hsa-miR-6756-5p was identified as a tumor-specific oncogenic microRNA promoting cancer progression in vitro and in vivo.
- Single-cell analysis revealed distinct immune infiltration patterns and cell-type-specific biomarker expression in thyroid cancer.

## Abstract

The discovery of reliable biomarkers and therapeutic targets remains a critical challenge in thyroid cancer management. This study demonstrates the value of integrating traditional omics technologies with artificial intelligence approaches and single-cell validation to identify novel microRNA-based biomarkers and drug targets. We hypothesized that combining meta-analysis of bulk transcriptomics, machine learning-driven feature selection, and single-cell spatial mapping would enhance biomarker discovery and validation compared to using either approach independently.

We employed a hybrid strategy integrating traditional transcriptomic analysis with AI-driven methods. Meta-analysis of three bulk RNA-seq datasets (GSE65144, GSE33630, GSE50901) was performed using effect size analysis, followed by machine learning-based forward feature selection to identify optimal biomarker combinations. Single-cell RNA-seq data (GSE184362, 196,145 cells from 23 thyroid cancer samples) provided cell-type-specific validation and immune microenvironment profiling. Comprehensive experimental validation was conducted using TPC-1 and BHT101 cell lines through miR-6756-5p overexpression and CRISPRi-mediated knockdown, including functional assays and xenograft experiments to establish therapeutic potential.
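The effect-size step of the meta-analysis can be illustrated with a minimal sketch. The abstract does not state which effect-size estimator or pooling model was used, so Hedges' g with fixed-effect inverse-variance pooling is an assumption here. The function names `hedges_g` and `pool_fixed_effect` and the expression values are hypothetical.

```python
import numpy as np

def hedges_g(tumor, normal):
    """Hedges' g and its variance for one gene in one dataset (assumed estimator)."""
    tumor = np.asarray(tumor, dtype=float)
    normal = np.asarray(normal, dtype=float)
    n1, n2 = len(tumor), len(normal)
    # Pooled standard deviation across the two groups
    s_pooled = np.sqrt(
        ((n1 - 1) * tumor.var(ddof=1) + (n2 - 1) * normal.var(ddof=1))
        / (n1 + n2 - 2)
    )
    d = (tumor.mean() - normal.mean()) / s_pooled
    # Small-sample correction factor J = 1 - 3 / (4 * df - 1), df = n1 + n2 - 2
    j = 1.0 - 3.0 / (4.0 * (n1 + n2) - 9.0)
    var_d = (n1 + n2) / (n1 * n2) + d ** 2 / (2.0 * (n1 + n2))
    return j * d, j ** 2 * var_d

def pool_fixed_effect(effects, variances):
    """Inverse-variance weighted mean of per-dataset effects and its standard error."""
    effects = np.asarray(effects, dtype=float)
    weights = 1.0 / np.asarray(variances, dtype=float)
    pooled = np.sum(weights * effects) / np.sum(weights)
    return float(pooled), float(np.sqrt(1.0 / np.sum(weights)))

# Hypothetical expression values for one gene in three datasets (not real data)
datasets = [
    ([7.1, 7.9, 8.4, 8.0], [6.0, 6.3, 5.8, 6.5]),
    ([5.2, 6.1, 5.9], [4.8, 5.0, 4.4]),
    ([9.0, 8.7, 9.4, 9.1, 8.8], [8.1, 8.3, 7.9, 8.0, 8.4]),
]
per_dataset = [hedges_g(t, n) for t, n in datasets]
pooled_g, pooled_se = pool_fixed_effect(
    [g for g, _ in per_dataset], [v for _, v in per_dataset]
)
print(f"pooled g = {pooled_g:.2f}, SE = {pooled_se:.2f}")
```

Genes would then be ranked by the pooled effect before feature selection. A random-effects model would be the natural alternative if the datasets were expected to be heterogeneous.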

The AI-enhanced meta-analysis identified a four-gene diagnostic panel (BID, MIR6756, ITM2A, TGM2) achieving exceptional performance with AUC values of 1.0 and 0.99 in training sets and 0.74 in independent validation. Single-cell analysis of 50,000 cells revealed six major cell types with significant immune infiltration (61.9%), providing crucial cell-type specificity for the identified biomarkers. BID and ITM2A showed predominantly epithelial expression, while TGM2 was enriched in immune and stromal compartments, demonstrating multi-cellular biomarker patterns. Immune microenvironment analysis revealed distinct CD8+/CD4+ T cell ratios between metastatic and non-metastatic samples. hsa-miR-6756-5p, identified through this integrated approach, exhibited tumor-specific expression and demonstrated oncogenic properties by promoting proliferation, colony formation, migration, and invasion in vitro, while enhancing tumor growth in vivo, validating it as a novel therapeutic target.
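The gap between training and validation AUC reported for the panel is easier to interpret with a sketch of greedy forward feature selection. The abstract does not name the classifier or scoring rule, so the example below makes a simplifying assumption: a panel is scored as the signed sum of its (pre-standardized) features, and AUC is computed with the Mann-Whitney statistic. The names `auc`, `panel_score`, and `forward_select` are hypothetical, and the data are synthetic.

```python
import numpy as np

def auc(scores, labels):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    diff = scores[labels][:, None] - scores[~labels][None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())

def panel_score(X, cols, signs):
    """Signed sum of the selected features (assumed scoring rule)."""
    return (X[:, cols] * signs[cols]).sum(axis=1)

def forward_select(X, y, max_features=4):
    """Greedily add the feature that most improves training AUC."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=bool)
    # Direction of each feature: +1 if higher in positives, else -1
    signs = np.sign(X[y].mean(axis=0) - X[~y].mean(axis=0))
    signs[signs == 0] = 1.0
    selected, best = [], 0.0
    while len(selected) < max_features:
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        if not candidates:
            break
        scored = [(auc(panel_score(X, selected + [j], signs), y), j)
                  for j in candidates]
        top_auc, top_j = max(scored)
        if top_auc <= best:  # stop when no candidate improves the panel
            break
        selected.append(top_j)
        best = top_auc
    return selected, best, signs

# Synthetic demo: 2 informative features among 20, small training set
rng = np.random.default_rng(0)
y_train = np.array([0] * 15 + [1] * 15)
y_valid = np.array([0] * 15 + [1] * 15)
X_train = rng.normal(size=(30, 20))
X_valid = rng.normal(size=(30, 20))
X_train[y_train == 1, :2] += 1.0
X_valid[y_valid == 1, :2] += 1.0

panel, train_auc, signs = forward_select(X_train, y_train)
valid_auc = auc(panel_score(X_valid, panel, signs), y_valid)
print(panel, round(train_auc, 2), round(valid_auc, 2))
```

Because the panel is chosen to maximize AUC on the training data, the training AUC is optimistically biased. Evaluating the fixed panel on samples that played no part in selection, as the study does with its independent validation set, gives the more realistic estimate.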

Our study exemplifies the synergistic value of integrating traditional omics approaches with AI-driven analytics for biomarker and drug target discovery. The combination of machine learning-based feature selection from bulk transcriptomics with single-cell spatial validation addresses limitations of each approach used independently. This integrated framework successfully identified hsa-miR-6756-5p as both a diagnostic biomarker and therapeutic target, demonstrating how traditional experimental validation coupled with computational prediction enhances translational potential. The multi-scale approach spanning bulk transcriptomics, AI-driven biomarker selection, single-cell characterization, and functional validation represents an effective paradigm for developing clinically relevant cancer biomarkers and therapeutic targets.

## Linked entities

- **Genes:** BID (BH3 interacting domain death agonist) [NCBI Gene 637], MIR6756 (microRNA 6756) [NCBI Gene 102465453], ITM2A (integral membrane protein 2A) [NCBI Gene 9452], TGM2 (transglutaminase 2) [NCBI Gene 7052]
- **Diseases:** thyroid cancer (MONDO:0002108)

## Full-text entities

- **Genes:** CD8A (CD8 subunit alpha) [NCBI Gene 925] {aka CD8, CD8alpha, IMD116, Leu2, p32}, CD4 (CD4 molecule) [NCBI Gene 920] {aka CD4mut, IMD79, Leu-3, OKT4D, T4}, ITM2A (integral membrane protein 2A) [NCBI Gene 9452] {aka BRICD2A, E25A}, MIR6756 (microRNA 6756) [NCBI Gene 102465453] {aka hsa-mir-6756}, TGM2 (transglutaminase 2) [NCBI Gene 7052] {aka G(h), TG(C), TGC, hTG2, tTG}, BID (BH3 interacting domain death agonist) [NCBI Gene 637] {aka FP497}
- **Diseases:** thyroid cancer (MESH:D013964), cancer (MESH:D009369)

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12891075/full.md

---
Source: https://tomesphere.com/paper/PMC12891075