Artificial intelligence application in lymphoma diagnosis with Vision Transformer using weakly supervised training

Nghia (Andy) Nguyen; Amer Wahed; Andy Quesada; Yasir Ali; Hanadi El Achi; Y. Helen Zhang; Jocelyn Ursua; Alex Banerjee; Sahib Kalra; L. Jeffrey Medeiros; Jie Xu

arXiv:2604.13795·cs.CV·April 16, 2026

TL;DR

This study shows that a Vision Transformer trained with weak supervision on a large set of image patches can distinguish anaplastic large cell lymphoma (ALCL) from classic Hodgkin lymphoma (cHL), offering a practical alternative to fully supervised training.

Contribution

The paper introduces a weakly supervised training approach for Vision Transformers in lymphoma diagnosis, reducing the need for expert-labeled data while maintaining high accuracy.

Findings

01

Achieved 91.85% accuracy, 0.92 F1 score, and 0.98 AUC with weakly supervised ViT.

02

Weakly supervised training on 100,000 image patches is a more practical alternative to the earlier fully supervised model, which was trained on only 1,200 patches.

03

ViT with weak supervision is suitable for clinical deep learning applications.
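
For readers unfamiliar with the three metrics reported above, the following is a minimal sketch of how accuracy, F1 score, and AUC are computed for a two-class problem such as ALCL versus cHL. It is not the authors' evaluation code. The function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration, and the source does not say which F1 averaging the paper uses, so the single-positive-class form is shown.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores, positive=1):
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = ALCL, 0 = cHL; scores are predicted P(ALCL).
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(f"accuracy={accuracy(y_true, y_pred):.3f}")
print(f"f1={f1_score(y_true, y_pred):.3f}")
print(f"auc={roc_auc(y_true, scores):.3f}")
```

Accuracy and F1 depend on the chosen threshold, while AUC is computed from the raw scores, which is why a model can report a higher AUC than its accuracy.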

Abstract

Vision transformers (ViTs) have been shown to allow for more flexible feature detection and can outperform convolutional neural networks (CNNs) when pre-trained on sufficient data. Because of their promising feature detection capabilities, we deployed ViTs for morphological classification of anaplastic large cell lymphoma (ALCL) versus classic Hodgkin lymphoma (cHL). We had previously designed a ViT model that was trained on a small dataset of 1,200 image patches with fully supervised training. That model achieved a diagnostic accuracy of 100% and an F1 score of 1.0 on the independent test set. Since fully supervised training is not practical, given the limited expert resources available in both the training and testing phases, we conducted a recent study on a modified approach to training data (weakly supervised training) and show that labeling training image patches automatically at the slide…
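
The abstract is cut off before it describes the labeling procedure, so the following is only a sketch of one common form of weak supervision, under the assumption that each patch simply inherits the diagnosis recorded for the slide it was cut from. The `Patch` class, the `label_patches_from_slides` function, and the slide identifiers are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    """A hypothetical image patch, identified by its source slide and position."""
    slide_id: str
    index: int


def label_patches_from_slides(patches, slide_labels):
    """Assign each patch the diagnosis of its source slide (assumed scheme).

    Patches from slides without a recorded diagnosis are skipped, so no
    expert has to review individual patches.
    """
    labeled = []
    for patch in patches:
        label = slide_labels.get(patch.slide_id)
        if label is None:
            continue  # no slide-level diagnosis available for this patch
        labeled.append((patch, label))
    return labeled


# Hypothetical slide-level diagnoses and patches.
slide_labels = {"slide_A": "ALCL", "slide_B": "cHL"}
patches = [
    Patch("slide_A", 0),
    Patch("slide_A", 1),
    Patch("slide_B", 0),
    Patch("slide_C", 0),  # slide_C has no diagnosis, so this patch is dropped
]

training_set = label_patches_from_slides(patches, slide_labels)
for patch, label in training_set:
    print(patch.slide_id, patch.index, label)
```

Under this assumed scheme, some patches will carry a label that does not match their local morphology, which is the usual trade-off of weak supervision: labels become cheap to produce at scale but noisier than patch-level expert annotation.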
