# IC2Bert: masked gene expression pretraining and supervised fine tuning for robust immune checkpoint blockade (ICB) response prediction

**Authors:** Seongyong Park, Seonkyu Kim, Peng Jiang

PMC · DOI: 10.1038/s41598-025-14166-x · Scientific Reports · 2025-08-01

## TL;DR

This paper introduces IC2Bert, a model that combines masked gene expression pretraining with supervised fine-tuning to predict immune checkpoint blockade (ICB) treatment response from bulk RNA-seq data more robustly across heterogeneous patient cohorts.

## Contribution

The novel IC2Bert model combines masked gene expression pretraining with supervised fine-tuning to address cohort heterogeneity in ICB response prediction.
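
As a rough illustration of the masked-pretraining idea, the sketch below hides a random subset of a sample's expression tokens and records the originals as reconstruction targets. It is not taken from the IC2Bert code base. The binned-integer representation of expression, the `MASK_TOKEN` value, the 15% mask rate, and the function name `mask_expression_tokens` are all assumptions made for this example.

```python
import random

MASK_TOKEN = -1  # hypothetical mask id; assumed not to collide with real bin ids


def mask_expression_tokens(tokens, mask_rate=0.15, seed=0):
    """Mask a random subset of expression tokens.

    Returns (masked_tokens, targets), where targets maps each masked
    position to the original token a model would be trained to recover.
    """
    rng = random.Random(seed)
    # Mask at least one position, but never more than the sequence holds.
    n_mask = min(len(tokens), max(1, round(len(tokens) * mask_rate)))
    positions = rng.sample(range(len(tokens)), n_mask)
    masked = list(tokens)
    targets = {}
    for pos in positions:
        targets[pos] = tokens[pos]
        masked[pos] = MASK_TOKEN
    return masked, targets


# Hypothetical sample: 20 genes, each expression value binned into 0..7.
tokens = [i % 8 for i in range(20)]
masked, targets = mask_expression_tokens(tokens)
```

A pretraining loss would then be computed only at the positions listed in `targets`, so the model learns to infer hidden expression values from the surrounding genes without needing response labels.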

## Key findings

- IC2Bert outperforms existing methods in predictive accuracy and robustness across diverse RNA-seq datasets.
- Performance was evaluated with Leave-One-Dataset-Out Cross-Validation (LODOCV), in which each cohort is held out in turn.
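
The LODOCV splitting scheme can be sketched as follows. This is a minimal, generic illustration and not the authors' evaluation code: the function name `leave_one_dataset_out` and the cohort labels are hypothetical, and how the held-out cohort is used beyond the split (for example, any fine-tuning partition) is not described in this summary.

```python
from collections import defaultdict


def leave_one_dataset_out(cohort_labels):
    """Yield (held_out, train_idx, test_idx) once per distinct cohort.

    cohort_labels[i] names the dataset that sample i came from.
    """
    by_cohort = defaultdict(list)
    for i, cohort in enumerate(cohort_labels):
        by_cohort[cohort].append(i)
    for held_out in sorted(by_cohort):
        test_idx = by_cohort[held_out]
        # Every sample from every other cohort goes to the training side.
        train_idx = sorted(
            i for cohort, idx in by_cohort.items() if cohort != held_out for i in idx
        )
        yield held_out, train_idx, test_idx


# Hypothetical example: six samples drawn from three cohorts.
cohorts = ["A", "A", "B", "C", "C", "C"]
splits = list(leave_one_dataset_out(cohorts))
```

Because no sample from the held-out cohort appears on the training side, each fold measures how well a model transfers to a dataset it has not seen, which is the cohort-heterogeneity question the paper targets.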

## Abstract

Bulk RNA-seq-based prediction of immune checkpoint blockade (ICB) responses has been extensively studied to distinguish responders from non-responders. However, cohort heterogeneity remains a major challenge, hindering the robustness and generalizability of predictive models across diverse RNA-seq datasets. In this study, we present IC2Bert, a novel model that employs masked gene expression pretraining combined with domain-specific supervised fine-tuning to enhance predictive robustness across heterogeneous ICB response cohorts. To ensure an objective evaluation, we assessed the model’s performance using a Leave-One-Dataset-Out Cross-Validation (LODOCV) approach. IC2Bert demonstrated significantly improved predictive accuracy and robustness compared to existing methods, effectively addressing the challenges posed by cohort heterogeneity. The IC2Bert model and its source code are publicly available on GitHub: https://github.com/data2intelligence/ic2bert.

## Full-text entities

- **Genes:** CTLA4 (cytotoxic T-lymphocyte associated protein 4) [NCBI Gene 1493] {aka ALPS5, CD, CD152, CELIAC3, CTLA-4, GRD4}, CD19 (CD19 molecule) [NCBI Gene 930] {aka B4, CVID3}, MS4A1 (membrane spanning 4-domains A1) [NCBI Gene 931] {aka B1, Bp35, CD20, CVID5, FMC7, LEU-16}, RRM2 (ribonucleotide reductase regulatory subunit M2) [NCBI Gene 6241] {aka C2orf48, R2, RR2, RR2M}, CD4 (CD4 molecule) [NCBI Gene 920] {aka CD4mut, IMD79, Leu-3, OKT4D, T4}, CDH1 (cadherin 1) [NCBI Gene 999] {aka Arc-1, BCDS1, CD324, CDHE, ECAD, LCAM}, CD8A (CD8 subunit alpha) [NCBI Gene 925] {aka CD8, CD8alpha, IMD116, Leu2, p32}, MKI67 (marker of proliferation Ki-67) [NCBI Gene 4288] {aka KIA, MIB-, MIB-1, PPP1R105}, PDCD1 (programmed cell death 1) [NCBI Gene 5133] {aka ADMIO4, AIMTBS, CD279, PD-1, PD1, SLEB2}, PRC1 (protein regulator of cytokinesis 1) [NCBI Gene 9055] {aka ASE1, MAP65}, CD274 (CD274 molecule) [NCBI Gene 29126] {aka ADMIO5, B7-H, B7H1, PD-L1, PDCD1L1, PDCD1LG1}, CKS1B (CDC28 protein kinase regulatory subunit 1B) [NCBI Gene 1163] {aka CKS1, PNAS-16, PNAS-18, ckshs1}, FCRL5 (Fc receptor like 5) [NCBI Gene 83416] {aka BXMAS1, CD307, CD307e, FCRH5, IRTA2, PRO820}, POU2AF1 (POU class 2 homeobox associating factor 1) [NCBI Gene 5450] {aka BOB1, OBF-1, OBF1, OCAB}, NUSAP1 (nucleolar and spindle associated protein 1) [NCBI Gene 51203] {aka ANKT, BM037, LNP, NUSAP, PRO0310p1, Q0310}, TOP2A (DNA topoisomerase II alpha) [NCBI Gene 7153] {aka TOP2, TOP2alpha, TOPIIA, TP2A}, CD79A (CD79a molecule) [NCBI Gene 973] {aka IGA, IGAlpha, MB-1, MB1}, TGFB1 (transforming growth factor beta 1) [NCBI Gene 7040] {aka CAEND1, CED, DPD1, IBDIMDE, LAP, TGF-beta1}
- **Diseases:** renal cell carcinoma (MESH:D002292), ICB (MESH:D007154), Metastatic melanoma (MESH:D008545), non-small cell lung cancer (MESH:D002289), colorectal cancer (MESH:D015179), Cancer (MESH:D009369), Tumor Inflammation (MESH:D007249)
- **Chemicals:** pembrolizumab (MESH:C582435), bevacizumab (MESH:D000068258), Atezo (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12313928/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12313928/full.md

## References

14 references — full list in the complete paper: https://tomesphere.com/paper/PMC12313928/full.md

---
Source: https://tomesphere.com/paper/PMC12313928