Vision Transformer-based Model for Severity Quantification of Lung Pneumonia Using Chest X-ray Images
Bouthaina Slika, Fadi Dornaika, Hamid Merdji, Karim Hammoudi

TL;DR
This paper introduces ViTReg-IP, a Vision Transformer-based model that quantifies COVID-19 severity from chest X-rays with high accuracy and good generalization across datasets, while requiring fewer trainable parameters and fewer computational resources.
Contribution
The paper presents a novel ViT-based regression model for COVID-19 severity assessment that outperforms existing methods in accuracy and efficiency, with demonstrated robustness across multiple datasets.
Findings
High accuracy in severity quantification
Strong generalization across datasets
Low computational cost
Abstract
To develop generic and reliable approaches for diagnosing and assessing the severity of COVID-19 from chest X-rays (CXR), a large number of well-maintained COVID-19 datasets are needed. Existing severity quantification architectures require expensive training calculations to achieve the best results. For healthcare professionals to quickly and automatically identify COVID-19 patients and predict associated severity indicators, computer utilities are needed. In this work, we propose a Vision Transformer (ViT)-based neural network model that relies on a small number of trainable parameters to quantify the severity of COVID-19 and other lung diseases. We present a feasible approach to quantify the severity of CXR, called Vision Transformer Regressor Infection Prediction (ViTReg-IP), derived from a ViT and a regression head. We investigate the generalization potential of our model using a…
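The pairing of a ViT backbone with a regression head that the abstract describes can be illustrated with a toy NumPy sketch. This is not the authors' ViTReg-IP implementation: the class name `TinyViTRegressor`, the image and patch sizes, the embedding width, the single attention head, the single encoder block, the ReLU activation, and the random untrained weights are all assumptions chosen to keep the example short. It only shows the data flow from image patches to one scalar score.

```python
import numpy as np


def layer_norm(x, eps=1e-6):
    """Normalize each token vector to zero mean and unit variance."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


class TinyViTRegressor:
    """Hypothetical toy model: patch embedding, one encoder block, linear head."""

    def __init__(self, image_size=32, patch_size=8, dim=16, seed=0):
        rng = np.random.default_rng(seed)
        self.patch_size = patch_size
        n_patches = (image_size // patch_size) ** 2

        def init(*shape):
            # Small random weights; a real model would learn these.
            return rng.normal(0.0, 0.02, size=shape)

        self.w_embed = init(patch_size * patch_size, dim)  # patch projection
        self.cls_token = init(1, dim)                      # summary token
        self.pos_embed = init(n_patches + 1, dim)          # absolute positions
        self.w_q, self.w_k = init(dim, dim), init(dim, dim)
        self.w_v, self.w_o = init(dim, dim), init(dim, dim)
        self.w_mlp1, self.w_mlp2 = init(dim, 4 * dim), init(4 * dim, dim)
        self.w_head = init(dim, 1)                         # regression head
        self.b_head = np.zeros(1)

    def patchify(self, image):
        """Split an (H, W) grayscale image into flattened square patches."""
        p = self.patch_size
        h, w = image.shape
        patches = image.reshape(h // p, p, w // p, p).transpose(0, 2, 1, 3)
        return patches.reshape(-1, p * p)

    def forward(self, image):
        """Return one scalar score for one image."""
        tokens = self.patchify(image) @ self.w_embed
        x = np.concatenate([self.cls_token, tokens], axis=0) + self.pos_embed

        # Single-head self-attention with a residual connection.
        h = layer_norm(x)
        q, k, v = h @ self.w_q, h @ self.w_k, h @ self.w_v
        attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))
        x = x + (attn @ v) @ self.w_o

        # Position-wise MLP (ReLU here for brevity) with a residual connection.
        h = layer_norm(x)
        x = x + np.maximum(h @ self.w_mlp1, 0.0) @ self.w_mlp2

        # Regression head: read the summary token and map it to a scalar.
        cls = layer_norm(x)[0]
        return float((cls @ self.w_head + self.b_head)[0])


# Usage: a random array stands in for a preprocessed chest X-ray.
model = TinyViTRegressor()
image = np.random.default_rng(1).random((32, 32))
score = model.forward(image)
print(score)
```

The only difference from a ViT classifier in this sketch is the final layer: the head outputs a single continuous value rather than class logits, so training would typically minimize a regression loss against the severity labels.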
Taxonomy
Topics: COVID-19 diagnosis using AI · AI in cancer detection · Machine Learning in Healthcare
Methods: Multi-Head Attention · Attention Is All You Need · Test · Linear Layer · Absolute Position Encodings · Label Smoothing · Byte Pair Encoding · Residual Connection · Dropout · Layer Normalization
