More for Less: Compact Convolutional Transformers Enable Robust Medical Image Classification with Limited Data
Andrew Kean Gao

TL;DR
This paper demonstrates that Compact Convolutional Transformers (CCTs) can classify medical images accurately with limited data, reaching 92.49% accuracy on an eight-class blood cell benchmark without the large datasets that conventional Vision Transformers require.
Contribution
The study introduces and validates the use of CCTs for medical image classification in data-scarce environments, showing their superior performance over conventional models.
Findings
Achieved 92.49% accuracy on blood cell classification
CCT learned quickly, reaching 80% accuracy after five epochs
Performance was strong across all cell types
Abstract
Transformers are very powerful tools for a variety of tasks across domains, from text generation to image captioning. However, transformers require substantial amounts of training data, which is often a challenge in biomedical settings, where high quality labeled data can be challenging or expensive to obtain. This study investigates the efficacy of Compact Convolutional Transformers (CCT) for robust medical image classification with limited data, addressing a key issue faced by conventional Vision Transformers - their requirement for large datasets. A hybrid of transformers and convolutional layers, CCTs demonstrate high accuracy on modestly sized datasets. We employed a benchmark dataset of peripheral blood cell images of eight distinct cell types, each represented by approximately 2,000 low-resolution (28x28x3 pixel) samples. Despite the dataset size being smaller than those…
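The abstract describes CCTs as a hybrid of convolutional layers and transformers. As a rough illustration of what that means for a 28x28 input, the sketch below computes how many tokens a convolution-plus-pooling tokenizer would hand to the transformer, next to the count from ViT-style patch splitting. The kernel, stride, padding, and patch values, as well as the function names, are illustrative assumptions and not the configuration reported in the paper.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (floor formula)."""
    return (size + 2 * padding - kernel) // stride + 1


def cct_num_tokens(image_size, blocks):
    """Sequence length after a stack of (conv, max-pool) tokenizer blocks.

    Each block is a pair of dicts holding kernel/stride/padding settings.
    Every spatial position of the final feature map becomes one token.
    """
    size = image_size
    for conv, pool in blocks:
        size = conv_out(size, **conv)
        size = conv_out(size, **pool)
    return size * size


def vit_num_tokens(image_size, patch_size):
    """Sequence length when the image is cut into non-overlapping patches."""
    return (image_size // patch_size) ** 2


# Assumed hyperparameters (hypothetical, chosen only for illustration):
# one 3x3 convolution with stride 1, followed by 3x3 max pooling with stride 2.
tokenizer = [
    (dict(kernel=3, stride=1, padding=1), dict(kernel=3, stride=2, padding=1)),
]

print("CCT-style tokens:", cct_num_tokens(28, tokenizer))
print("ViT-style tokens (patch size 4):", vit_num_tokens(28, 4))
```

Under these assumed settings, the convolutional tokenizer keeps overlapping local context in each token, while patch splitting treats each patch independently. That local inductive bias is one commonly cited reason convolutional tokenization helps on small datasets.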
Taxonomy
Topics: Digital Imaging for Blood Diseases · Cell Image Analysis Techniques · COVID-19 diagnosis using AI
Methods: Multi-Head Attention · Attention Is All You Need · Layer Normalization · Absolute Position Encodings · Byte Pair Encoding · Linear Layer · Label Smoothing · Adam · Position-Wise Feed-Forward Layer · Residual Connection
