Multi-Scale Hybrid Vision Transformer for Learning Gastric Histology: AI-Based Decision Support System for Gastric Cancer Treatment
Yujin Oh, Go Eun Bae, Kyung-Hee Kim, Min-Kyung Yeo, Jong Chul Ye

TL;DR
This paper introduces a multi-scale hybrid Vision Transformer AI system that accurately classifies gastric cancer subtypes from histology slides, improving diagnostic sensitivity and efficiency in clinical settings.
Contribution
It presents a novel hybrid ViT model enabling fine-grained gastric cancer subclassification aligned with treatment guidance, enhancing AI-assisted pathology.
Findings
Achieved class-average sensitivity above 0.85 on 1,212 slides.
With AI assistance, pathologists' diagnostic sensitivity improved by 12%.
With AI assistance, screening time fell by 18%.
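For readers unfamiliar with the metric, class-average sensitivity is usually read as the unweighted mean of per-class recall, so rare classes count as much as common ones. The sketch below shows that reading only. The function name and the placeholder labels "A", "B", "C" are hypothetical, the toy predictions are not data from the paper, and the paper's exact evaluation protocol may differ.

```python
from collections import defaultdict


def class_average_sensitivity(y_true, y_pred):
    """Return (macro-averaged sensitivity, per-class sensitivity dict).

    Sensitivity for a class = correctly predicted samples of that class
    divided by all samples whose true label is that class.
    """
    true_positives = defaultdict(int)
    support = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        support[truth] += 1
        if truth == pred:
            true_positives[truth] += 1
    per_class = {c: true_positives[c] / support[c] for c in support}
    # Unweighted mean: every class contributes equally, whatever its size.
    return sum(per_class.values()) / len(per_class), per_class


# Toy labels (hypothetical, three placeholder classes).
y_true = ["A", "A", "A", "A", "B", "B", "C", "C", "C", "C"]
y_pred = ["A", "A", "A", "B", "B", "B", "C", "C", "A", "C"]
average, per_class = class_average_sensitivity(y_true, y_pred)
print(per_class)  # per-class sensitivities: A 0.75, B 1.0, C 0.75
print(round(average, 4))
```

Averaging per class rather than per slide keeps a dominant class from masking poor sensitivity on a smaller one, which matters when each subclass maps to a different treatment.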
Abstract
Gastric endoscopic screening is an effective way to decide appropriate gastric cancer (GC) treatment at an early stage, reducing the GC-associated mortality rate. Although artificial intelligence (AI) has shown great promise in assisting pathologists in screening digitized whole slide images, existing AI systems are limited in fine-grained cancer subclassification and have little usability in planning cancer treatment. We propose a practical AI system that enables five subclassifications of GC pathology, which can be directly matched to general GC treatment guidance. The AI system is designed to efficiently differentiate multiple classes of GC through a multi-scale self-attention mechanism using two-stage hybrid Vision Transformer (ViT) networks, mimicking the way human pathologists understand histology. The AI system demonstrates reliable diagnostic performance by achieving class-average…
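The two-stage, multi-scale idea in the abstract can be illustrated with a toy sketch: self-attention first within each patch (fine scale), then across patch embeddings (coarse scale), followed by a five-way classifier. This is not the authors' architecture. Everything below is an assumption made for illustration: random vectors stand in for the CNN features a hybrid ViT would normally consume, attention uses identity projections with no learned weights, and all names (`self_attention`, `classify_slide`, `DIM`, and so on) are hypothetical.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the tokens themselves
    (identity projections), so there are no learned parameters here.
    """
    dim = len(tokens[0])
    attended = []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        attended.append(
            [sum(w * tok[j] for w, tok in zip(weights, tokens)) for j in range(dim)]
        )
    return attended


def mean_pool(tokens):
    """Average a list of equal-length vectors into one vector."""
    return [sum(tok[j] for tok in tokens) / len(tokens) for j in range(len(tokens[0]))]


def classify_slide(slide, class_weights):
    """slide: list of patches; each patch: list of local feature vectors."""
    # Stage 1 (fine scale): attend within each patch, pool to one embedding per patch.
    patch_embeddings = [mean_pool(self_attention(patch)) for patch in slide]
    # Stage 2 (coarse scale): attend across patches, pool to one slide embedding.
    slide_embedding = mean_pool(self_attention(patch_embeddings))
    # Linear head followed by softmax over the classes.
    logits = [sum(w * x for w, x in zip(row, slide_embedding)) for row in class_weights]
    return softmax(logits)


random.seed(0)
DIM, N_CLASSES = 8, 5          # five classes, matching the five GC subclassifications
N_PATCHES, N_LOCAL = 6, 16     # hypothetical sizes for the toy slide

# Random stand-ins for CNN features and for a trained classifier head.
slide = [
    [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N_LOCAL)]
    for _ in range(N_PATCHES)
]
class_weights = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(N_CLASSES)]

probs = classify_slide(slide, class_weights)
print([round(p, 3) for p in probs])
```

The point of the two stages is that attention cost stays manageable: each stage attends over a short sequence (local features within a patch, then patches within a slide) instead of one very long sequence covering the whole slide.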
Taxonomy
Topics: Colorectal Cancer Screening and Detection · AI in cancer detection · Gastric Cancer Management and Outcomes
Methods: Attention Is All You Need · Linear Layer · Softmax · Dense Connections · Layer Normalization · Multi-Head Attention · Byte Pair Encoding · Absolute Position Encodings · Label Smoothing · Position-Wise Feed-Forward Layer
