Artificial intelligence application in lymphoma diagnosis with Vision Transformer using weakly supervised training
Nghia (Andy) Nguyen, Amer Wahed, Andy Quesada, Yasir Ali, Hanadi El Achi, Y. Helen Zhang, Jocelyn Ursua, Alex Banerjee, Sahib Kalra, L. Jeffrey Medeiros, Jie Xu

TL;DR
This study shows that a Vision Transformer (ViT) trained with weak supervision on a large set of image patches can distinguish anaplastic large cell lymphoma (ALCL) from classic Hodgkin lymphoma (cHL), offering a practical alternative to fully supervised training.
Contribution
The paper introduces a weakly supervised training approach for Vision Transformers in lymphoma diagnosis, reducing the need for expert-labeled data while maintaining high accuracy.
Findings
The weakly supervised ViT achieved 91.85% accuracy, an F1 score of 0.92, and an AUC of 0.98.
Weakly supervised training on 100,000 image patches is presented as more practical than the authors' earlier fully supervised model, which was trained on only 1,200 patches.
ViT with weak supervision is suitable for clinical deep learning applications.
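For readers less familiar with the three reported metrics, here is a minimal sketch of how accuracy, F1 score, and AUC can be computed for a two-class problem. It is not the authors' evaluation code. The function names, the choice of class 1 as the positive class, and the toy data are all assumptions made for illustration, and the paper may have used a different averaging scheme for F1.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores, positive=1):
    """Area under the ROC curve, computed as the probability that a
    positive example is scored above a negative one (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 and 0 stand for the two diagnostic classes.
y_true = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]                     # model confidence for class 1
y_pred = [1 if s >= 0.5 else 0 for s in scores]   # threshold at 0.5

print(accuracy(y_true, y_pred), f1_score(y_true, y_pred), auc(y_true, scores))
```

Accuracy and F1 depend on the decision threshold, while AUC is computed from the raw scores, which is one reason the three numbers are usually reported together.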
Abstract
Vision transformers (ViTs) have been shown to allow for more flexible feature detection and can outperform convolutional neural networks (CNNs) when pre-trained on sufficient data. Due to their promising feature detection capabilities, we deployed ViTs for morphological classification of anaplastic large cell lymphoma (ALCL) versus classic Hodgkin lymphoma (cHL). We had previously designed a ViT model which was trained on a small dataset of 1,200 image patches in fully supervised training. That model achieved a diagnostic accuracy of 100% and an F1 score of 1.0 on the independent test set. Since fully supervised training is not a practical method due to the lack of expert resources in both the training and testing phases, we conducted a recent study on a modified approach to training data (weakly supervised training) and show that labeling training image patches automatically at the slide…
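The abstract is cut off before it describes the labeling procedure, so the following is only a sketch of what weak supervision commonly means in this setting: each image patch inherits the diagnosis of the slide it came from, with no per-patch expert review. The `Patch` class, the `weakly_label_patches` function, the slide identifiers, and the patch counts are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    slide_id: str   # slide the patch was cut from
    index: int      # position of the patch within that slide
    label: str      # diagnosis inherited from the slide (the weak label)


def weakly_label_patches(slide_labels, patches_per_slide):
    """Give every patch the label of its parent slide.

    slide_labels:      dict mapping slide id -> slide-level diagnosis
    patches_per_slide: dict mapping slide id -> number of patches extracted
    """
    patches = []
    for slide_id, label in slide_labels.items():
        for i in range(patches_per_slide[slide_id]):
            patches.append(Patch(slide_id, i, label))
    return patches


# Hypothetical example: two slides with known slide-level diagnoses.
slide_labels = {"slide_A": "ALCL", "slide_B": "cHL"}
patches_per_slide = {"slide_A": 3, "slide_B": 2}

training_set = weakly_label_patches(slide_labels, patches_per_slide)
for patch in training_set:
    print(patch)
```

The trade-off in this kind of scheme is that some patches will carry a label that does not match their actual content (for example, a patch of benign background tissue from a lymphoma slide), in exchange for a much larger training set obtained without per-patch annotation.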
