A New Perspective to Boost Vision Transformer for Medical Image Classification
Yuexiang Li, Yawen Huang, Nanjun He, Kai Ma, Yefeng Zheng

TL;DR
This paper introduces BOLT, a self-supervised learning method for medical image classification using Transformers, which effectively learns from limited data without relying on large-scale labeled datasets like ImageNet.
Contribution
The paper proposes BOLT, a novel self-supervised approach with an auxiliary difficulty ranking task to enhance Transformer training for medical images without large-scale pretraining.
Findings
BOLT outperforms ImageNet-pretrained weights on medical image classification tasks.
BOLT surpasses existing self-supervised methods in accuracy.
BOLT learns effectively from limited medical data with a Transformer backbone.
Abstract
The Transformer has achieved impressive success on various computer vision tasks. However, most existing studies require pretraining the Transformer backbone on a large-scale labeled dataset (e.g., ImageNet) to achieve satisfactory performance, and such datasets are usually unavailable for medical images. Additionally, due to the gap between medical and natural images, the improvement from ImageNet-pretrained weights degrades significantly when the weights are transferred to medical image processing tasks. In this paper, we propose Bootstrap Own Latent of Transformer (BOLT), a self-supervised learning approach designed specifically for medical image classification with a Transformer backbone. BOLT consists of two networks, namely the online and target branches, for self-supervised representation learning. Concretely, the online network is trained to predict the target network representation…
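The abstract is cut off before it says how the target branch is updated or which loss is used, so the following is only a minimal sketch of the general online/target idea, not the authors' implementation. It assumes the conventions of the original Bootstrap Your Own Latent (BYOL) method: a normalized prediction loss and an exponential-moving-average (EMA) target update. The function names and the `tau` parameter are hypothetical, and the auxiliary difficulty ranking task is not modeled here.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def prediction_loss(online_pred, target_repr):
    """Assumed BYOL-style loss: 2 - 2 * cosine similarity.

    It is 0 when the online prediction points in the same direction as
    the target representation and 4 when it points the opposite way.
    """
    p = l2_normalize(online_pred)
    z = l2_normalize(target_repr)
    cosine = sum(a * b for a, b in zip(p, z))
    return 2.0 - 2.0 * cosine


def ema_update(target_params, online_params, tau=0.99):
    """Assumed target update: a slow moving average of the online weights.

    The target branch receives no gradients in this sketch; it only
    tracks the online branch, with tau controlling how slowly.
    """
    return [tau * t + (1.0 - tau) * o
            for t, o in zip(target_params, online_params)]


if __name__ == "__main__":
    # Toy vectors standing in for the two branches' outputs.
    print(prediction_loss([1.0, 0.0], [2.0, 0.0]))
    print(ema_update([1.0, 1.0], [0.0, 2.0], tau=0.9))
```

In a real training loop the online branch would be a Transformer encoder followed by a predictor head, and the loss would be minimized by gradient descent on the online parameters only.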
Taxonomy
Topics: Digital Imaging for Blood Diseases · AI in cancer detection · COVID-19 diagnosis using AI
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Layer Normalization · Softmax · Adam · Byte Pair Encoding · Residual Connection · Label Smoothing · Dropout
