Predicting microsatellite instability and key biomarkers in colorectal cancer from H&E-stained images: Achieving SOTA predictive performance with fewer data using Swin Transformer
Bangwei Guo, Xingyu Li, Jitendra Jonnagaddala, Hong Zhang, Xu Steven Xu

TL;DR
This study introduces a Swin Transformer-based AI workflow that accurately predicts biomarkers such as MSI in colorectal cancer from H&E-stained images, requiring less training data than existing models while outperforming them.
Contribution
The paper presents a novel Swin Transformer approach that achieves state-of-the-art biomarker prediction in CRC with significantly less training data.
Findings
Achieved SOTA AUROC of 0.90 for MSI prediction.
Outperformed existing models in cross-validation and external validation.
Needed only 200-500 training samples, 5-10 times fewer than previous methods.
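For readers less familiar with the metric quoted above: AUROC can be read as the probability that a randomly chosen positive case (for example, an MSI tumor) receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The sketch below illustrates only that definition. The function name `auroc` and the toy labels and scores are hypothetical and are not taken from the paper or its code.

```python
def auroc(labels, scores):
    """AUROC via pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive case has the higher score; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical patient-level data: 1 = MSI, 0 = microsatellite stable.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.35, 0.7, 0.3, 0.2, 0.1]
toy_auroc = auroc(labels, scores)
print(f"toy AUROC = {toy_auroc:.3f}")
```

The pairwise loop is quadratic in the number of cases, which is fine for cohorts of a few hundred patients; a rank-based formulation gives the same value more efficiently on larger sets.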
Abstract
Artificial intelligence (AI) models have been developed for predicting clinically relevant biomarkers, including microsatellite instability (MSI), for colorectal cancers (CRC). However, the current deep-learning networks are data-hungry and require large training datasets, which are often lacking in the medical domain. In this study, based on the latest Hierarchical Vision Transformer using Shifted Windows (Swin-T), we developed an efficient workflow for biomarkers in CRC (MSI, hypermutation, chromosomal instability, CpG island methylator phenotype, BRAF, and TP53 mutation) that required only relatively small datasets but achieved state-of-the-art (SOTA) predictive performance. Our Swin-T workflow not only substantially outperformed published models in an intra-study cross-validation experiment using the TCGA-CRC-DX dataset (N = 462), but also showed excellent generalizability in…
Taxonomy
Topics: Genetic factors in colorectal cancer · Cancer Genomics and Diagnostics · Colorectal Cancer Screening and Detection
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · 1x1 Convolution · Grouped Convolution · Groupwise Point Convolution · Batch Normalization · Pointwise Convolution · Convolution
