HistoViT: Vision Transformer for Accurate and Scalable Histopathological Cancer Diagnosis

Faisal Ahmed

arXiv:2508.11181 · eess.IV · August 18, 2025

TL;DR

HistoViT introduces a transformer-based deep learning model that significantly improves multi-class histopathological cancer diagnosis accuracy and scalability across various tissue types, outperforming traditional CNNs.

Contribution

This work presents a novel Vision Transformer framework tailored for histopathology, addressing limitations of CNNs and demonstrating superior performance on multiple cancer datasets.

Findings

01. Achieved over 99% accuracy on the breast cancer dataset

02. Outperformed existing deep learning methods on all tested datasets

03. Demonstrated robustness and generalizability in digital pathology

Abstract

Accurate and scalable cancer diagnosis remains a critical challenge in modern pathology, particularly for breast, prostate, bone, and cervical cancers, which exhibit complex histological variability. In this study, we propose a transformer-based deep learning framework for multi-class tumor classification in histopathological images. By fine-tuning a Vision Transformer (ViT) architecture, our method addresses key limitations of conventional convolutional neural networks, offering improved performance, reduced preprocessing requirements, and enhanced scalability across tissue types. To adapt the model to histopathological cancer images, we implement a streamlined preprocessing pipeline that converts tiled whole-slide images into PyTorch tensors and standardizes them through data normalization. This ensures compatibility with the ViT architecture and enhances both…
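The preprocessing step described in the abstract (tiles converted to PyTorch tensors, then normalized) might look roughly like the sketch below. This is an illustration, not the author's code: the function names tile_to_tensor and batch_tiles are hypothetical, and the 224x224 tile size and ImageNet channel statistics are assumed defaults that the abstract does not state.

```python
import torch

# ASSUMPTION: ImageNet channel statistics, a common default for pretrained
# ViTs. The abstract does not say which statistics the paper uses.
MEAN = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
STD = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)


def tile_to_tensor(tile_hwc: torch.Tensor) -> torch.Tensor:
    """Convert one uint8 RGB tile (H, W, 3) to a normalized float tensor (3, H, W)."""
    if tile_hwc.dtype != torch.uint8 or tile_hwc.ndim != 3 or tile_hwc.shape[-1] != 3:
        raise ValueError("expected a uint8 tensor of shape (H, W, 3)")
    # Channels first, then scale pixel values from [0, 255] to [0, 1].
    chw = tile_hwc.permute(2, 0, 1).float() / 255.0
    # Per-channel standardization: subtract the mean, divide by the std.
    return (chw - MEAN) / STD


def batch_tiles(tiles: list[torch.Tensor]) -> torch.Tensor:
    """Stack equally sized tiles into one (N, 3, H, W) batch."""
    return torch.stack([tile_to_tensor(t) for t in tiles])


# Synthetic stand-ins for whole-slide image tiles (ASSUMED 224x224 size).
tiles = [torch.randint(0, 256, (224, 224, 3), dtype=torch.uint8) for _ in range(4)]
batch = batch_tiles(tiles)  # shape (4, 3, 224, 224), dtype float32
```

In practice the tiles would come from a slide reader rather than random data, and the normalization statistics could instead be computed from the training set.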
