Developing a Dual-Stage Vision Transformer Model for Lung Disease Classification

Anirudh Mazumder; Jianguo Liu

arXiv:2409.18257·eess.IV·April 4, 2025

Open Access

TL;DR

This paper introduces a dual-stage vision transformer model that combines a ViT and a Swin Transformer to classify 14 lung diseases from X-ray images, reporting 92.06% label-level accuracy on an unseen test subset with the aim of supporting faster diagnosis.

Contribution

It presents a novel dual-stage transformer architecture specifically designed for lung disease classification from X-ray scans, integrating two transformer models for improved accuracy.

Findings

01. Achieved 92.06% accuracy on unseen test data.
02. Effective in classifying 14 different lung diseases.
03. Demonstrated potential for aiding clinical diagnosis.

Abstract

Lung diseases are prevalent throughout the United States, affecting over 34 million people. Accurate and timely diagnosis of the different types of lung disease is critical, and Artificial Intelligence (AI) methods could speed up this process. This research builds a dual-stage vision transformer by integrating a Vision Transformer (ViT) and a Swin Transformer to classify 14 different lung diseases from patients' X-ray scans. After data preprocessing and training, the proposed model achieved a label-level accuracy of 92.06% when making predictions on an unseen testing subset of the dataset. The model shows promise for accurately classifying lung diseases and supporting the diagnosis of patients who suffer from them.
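The abstract reports accuracy at the label level but does not spell out how it is computed. A minimal sketch follows, assuming it means the fraction of individual (image, label) binary decisions that match the ground truth; this reading is an assumption, not something the abstract confirms. The function names and the toy data (3 labels instead of 14) are hypothetical and for illustration only.

```python
def label_level_accuracy(y_true, y_pred):
    """Fraction of individual (image, label) decisions that are correct.

    Assumed reading of "label-level" accuracy; the paper's exact
    definition is not given in the abstract.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same number of rows")
    correct = 0
    total = 0
    for true_row, pred_row in zip(y_true, y_pred):
        if len(true_row) != len(pred_row):
            raise ValueError("each row must have the same number of labels")
        for t, p in zip(true_row, pred_row):
            correct += int(t == p)
            total += 1
    if total == 0:
        raise ValueError("no labels to score")
    return correct / total


def exact_match_accuracy(y_true, y_pred):
    """Fraction of images whose whole label vector is predicted correctly."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and the same length")
    matches = sum(int(list(t) == list(p)) for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


# Toy multi-label data: 2 images, 3 binary disease labels each (hypothetical).
y_true = [[1, 0, 0], [0, 1, 1]]
y_pred = [[1, 0, 1], [0, 1, 1]]

label_acc = label_level_accuracy(y_true, y_pred)   # 5 of 6 decisions match
exact_acc = exact_match_accuracy(y_true, y_pred)   # 1 of 2 images fully match
print(f"label-level: {label_acc:.4f}, exact-match: {exact_acc:.4f}")
```

Under this reading, a single wrong label lowers the score only slightly, so label-level accuracy is generally at least as high as the stricter exact-match accuracy on the same predictions.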

Taxonomy

Topics: Brain Tumor Detection and Classification · COVID-19 diagnosis using AI

Methods: Attention Is All You Need · Layer Normalization · Adam · Linear Layer · Residual Connection · Position-Wise Feed-Forward Layer · Label Smoothing · Byte Pair Encoding · Absolute Position Encodings · Vision Transformer