Medical Image Classification with KAN-Integrated Transformers and Dilated Neighborhood Attention
Omid Nejati Manzari, Hojat Asgariandehkordi, Taha Koleilat, Yiming Xiao, Hassan Rivaz

TL;DR
MedViTV2 is a novel medical image classification model that integrates KAN layers and Dilated Neighborhood Attention to improve accuracy and efficiency, especially on corrupted data, outperforming previous methods across multiple datasets.
Contribution
Introduces MedViTV2, which combines Kolmogorov-Arnold Network (KAN) layers and Dilated Neighborhood Attention (DiNA) in a hierarchical hybrid strategy for robust, efficient medical image classification under real-world data corruption.
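To make the Dilated Neighborhood Attention (DiNA) idea concrete, here is a minimal 1D sketch in plain Python. It is not the MedViTV2 implementation: the function names `dilated_neighbors` and `dina_1d`, the 1D setting, and the single-head, projection-free form are all assumptions chosen for illustration. Each query attends only to a fixed number of tokens spaced `dilation` steps apart, with the window clamped at the sequence boundaries.

```python
import math


def dilated_neighbors(i, n, kernel, dilation):
    """Indices that query i attends to in a length-n sequence (hypothetical helper).

    Tokens sharing i's residue modulo `dilation` form a group; the query takes
    `kernel` consecutive members of that group, with the window shifted inward
    at the boundaries so every query sees exactly `kernel` neighbors.
    """
    r = i % dilation
    group_len = (n - r + dilation - 1) // dilation  # positions r, r+d, ... < n
    if group_len < kernel:
        raise ValueError("sequence too short for this kernel size and dilation")
    j = i // dilation                               # position of i inside its group
    start = min(max(j - kernel // 2, 0), group_len - kernel)
    return [r + (start + t) * dilation for t in range(kernel)]


def dina_1d(q, k, v, kernel, dilation):
    """Single-head attention restricted to each query's dilated neighborhood."""
    n, dim = len(q), len(q[0])
    out = []
    for i in range(n):
        idx = dilated_neighbors(i, n, kernel, dilation)
        # Scaled dot-product scores against the selected neighbors only.
        scores = [sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(dim) for j in idx]
        m = max(scores)                             # subtract max for numerical stability
        w = [math.exp(s - m) for s in scores]
        z = sum(w)
        out.append([sum(w[t] / z * v[j][c] for t, j in enumerate(idx))
                    for c in range(len(v[0]))])
    return out
```

With a sequence of 8 tokens, kernel size 3, and dilation 2, the query at position 7 attends to positions 3, 5, and 7. The cost per query depends on the kernel size rather than the sequence length, while the dilation widens the receptive field without adding neighbors.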
Findings
Achieved state-of-the-art results on 27 out of 29 datasets.
Reduced computational complexity by 44% relative to the original MedViT.
Improved accuracy by up to 13.4% on benchmarks.
Abstract
Convolutional networks, transformers, hybrid models, and Mamba-based architectures have demonstrated strong performance across various medical image classification tasks. However, these methods were primarily designed to classify clean images using labeled data. In contrast, real-world clinical data often involve image corruptions that are unique to multi-center studies and stem from variations in imaging equipment across manufacturers. In this paper, we introduce the Medical Vision Transformer (MedViTV2), a novel architecture incorporating Kolmogorov-Arnold Network (KAN) layers into the transformer architecture for the first time, aiming for generalized medical image classification. We have developed an efficient KAN block to reduce computational load while enhancing the accuracy of the original MedViT. Additionally, to counteract the fragility of our MedViT when scaled up, we propose…
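As a rough illustration of what a Kolmogorov-Arnold Network (KAN) layer computes, the sketch below replaces the fixed scalar weight on each input-output edge with a small learnable univariate function, then sums the edge outputs. It is not the efficient KAN block from the paper: the class name `KANLayer`, the grid size, and the initialization are assumptions, and a Gaussian radial basis is swapped in for the B-splines of the original KAN formulation purely to keep the example short.

```python
import math
import random


def silu(x):
    """SiLU activation, used here as the fixed base function on each edge."""
    return x / (1.0 + math.exp(-x))


class KANLayer:
    """Toy KAN layer: y_j = sum_i phi_ji(x_i), with a learnable phi per edge.

    Assumption: phi_ji(x) = base_ji * silu(x) + sum_g coef_jig * exp(-((x - t_g) / h)^2),
    i.e. a Gaussian radial basis on a uniform grid instead of B-splines.
    """

    def __init__(self, in_dim, out_dim, grid_size=5, lo=-1.0, hi=1.0, seed=0):
        rng = random.Random(seed)
        self.centers = [lo + (hi - lo) * g / (grid_size - 1) for g in range(grid_size)]
        self.width = (hi - lo) / (grid_size - 1)
        # One coefficient vector and one base weight per (output, input) edge.
        self.coef = [[[rng.gauss(0.0, 0.1) for _ in range(grid_size)]
                      for _ in range(in_dim)] for _ in range(out_dim)]
        self.base = [[rng.gauss(0.0, 0.1) for _ in range(in_dim)]
                     for _ in range(out_dim)]

    def edge(self, j, i, x):
        """Evaluate the univariate function on the edge from input i to output j."""
        basis = sum(c * math.exp(-((x - t) / self.width) ** 2)
                    for c, t in zip(self.coef[j][i], self.centers))
        return self.base[j][i] * silu(x) + basis

    def __call__(self, x):
        # Each output is a plain sum of univariate edge functions of the inputs.
        return [sum(self.edge(j, i, xi) for i, xi in enumerate(x))
                for j in range(len(self.coef))]


layer = KANLayer(in_dim=3, out_dim=2)
y = layer([0.1, -0.4, 0.9])  # list of 2 floats
```

In a trainable version the `coef` and `base` values would be parameters updated by gradient descent; plain lists are used here only so the sketch runs without any dependencies.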
Taxonomy
Topics: Brain Tumor Detection and Classification · Medical Imaging and Analysis · AI in cancer detection
Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Layer Normalization · Residual Connection · Dense Connections · Multi-Head Attention · Position-Wise Feed-Forward Layer · Adam · Softmax
