Residual Vision Transformer (ResViT) Based Self-Supervised Learning Model for Brain Tumor Classification
Meryem Altin Karagoz, O. Ufuk Nalbantoglu, and Geoffrey C. Fox

TL;DR
This paper introduces a self-supervised learning model based on Residual Vision Transformer for brain tumor classification, effectively addressing limited dataset issues and achieving high accuracy on multiple MRI datasets.
Contribution
The paper proposes a novel hybrid CNN-transformer SSL model that is pretrained on MRI synthesis as a pretext task and then fine-tuned for classification, improving brain tumor classification accuracy over existing models.
Findings
Achieves up to 98.53% accuracy on the Figshare dataset.
Pretraining on MRI data outperforms ImageNet pretraining.
Handles limited data by using synthetic MRI images to balance the training set.
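The balancing idea in the last finding can be illustrated with a small sketch. The summary does not say how many synthetic images are added per class, so the code below assumes the simplest policy: top up every class until it matches the largest one. The function name, class names, and counts are hypothetical placeholders, not figures from the paper.

```python
from collections import Counter


def synthetic_quota(labels):
    """Return how many synthetic samples each class needs to match the largest class.

    Assumption: "balancing" means topping up every class to the majority count.
    """
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


# Hypothetical, illustrative label list (not the paper's data).
train_labels = ["glioma"] * 6 + ["meningioma"] * 3 + ["pituitary"] * 4
quota = synthetic_quota(train_labels)
print(quota)  # {'glioma': 0, 'meningioma': 3, 'pituitary': 2}
```

Under this assumed policy, the quota for each class would be the number of images to draw from the synthesis model before fine-tuning.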
Abstract
Deep learning has proven very promising for interpreting MRI in brain tumor diagnosis. However, deep learning models suffer from a scarcity of brain MRI datasets for effective training. Self-supervised learning (SSL) models provide data-efficient and remarkable solutions to limited dataset problems. Therefore, this paper introduces a generative SSL model for brain tumor classification in two stages. The first stage is designed to pre-train a Residual Vision Transformer (ResViT) model for MRI synthesis as a pretext task. The second stage includes fine-tuning a ResViT-based classifier model as a downstream task. Accordingly, we aim to leverage local features via CNN and global features via ViT, employing a hybrid CNN-transformer architecture for ResViT in pretext and downstream tasks. Moreover, synthetic MRI images are utilized to balance the training set. The proposed model performs on…
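The handoff between the two stages described above can be sketched in a framework-agnostic way. This is a minimal illustration, not the authors' implementation: it assumes that the encoder parameters learned during MRI synthesis are what carry over to the classifier, that the synthesis decoder is discarded, and that a new classification head is attached. All key names and values are hypothetical.

```python
def init_classifier_from_pretext(pretext_weights, head_weights, encoder_prefix="encoder."):
    """Build downstream weights: reuse pretrained encoder entries, attach a new head.

    All key names here are hypothetical; real checkpoints will differ.
    """
    # Keep only the encoder parameters learned during the synthesis pretext task.
    transferred = {
        name: value
        for name, value in pretext_weights.items()
        if name.startswith(encoder_prefix)
    }
    # Assumption: the synthesis decoder is dropped and the head starts untrained.
    classifier = dict(transferred)
    classifier.update(head_weights)
    return classifier


# Stage 1 result (illustrative values only): weights after pretext pretraining.
pretext_weights = {
    "encoder.conv.weight": [0.12, -0.40],
    "encoder.attn.weight": [0.75],
    "decoder.out.weight": [0.33],
}
# Stage 2 input: a freshly initialised classification head.
head_weights = {"head.fc.weight": [0.0, 0.0, 0.0]}

classifier_weights = init_classifier_from_pretext(pretext_weights, head_weights)
print(sorted(classifier_weights))
# ['encoder.attn.weight', 'encoder.conv.weight', 'head.fc.weight']
```

In a deep learning framework, the same pattern would usually be expressed by filtering a saved checkpoint by parameter name before loading it into the classifier.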
Taxonomy
Topics: Brain Tumor Detection and Classification
Methods: Attention Is All You Need · Dense Connections · Label Smoothing · Adam · Residual Connection · Byte Pair Encoding · Parallel GAN · Linear Layer · Softmax · Position-Wise Feed-Forward Layer
