Are Data Augmentation and Segmentation Always Necessary? Insights from COVID-19 X-Rays and a Methodology Thereof
Aman Swaraj, Arnav Agarwal, Hitendra Singh Bhadouria, Sandeep Kumar, Karan Verma

TL;DR
This paper critically examines the necessity of lung segmentation and data augmentation in COVID-19 X-ray classification, proposing a methodology that improves reliability and reduces overfitting.
Contribution
It introduces SDL-COVID, a methodology that emphasizes lung segmentation and optimal data augmentation for more accurate COVID-19 detection from chest X-rays.
Findings
Lung segmentation is essential for accurate COVID-19 prediction.
Excessive data augmentation leads to overfitting and reduced test accuracy.
SDL-COVID achieves 95.21% precision with fewer false negatives.
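Precision and false negatives measure different things: precision penalizes false positives, while false negatives lower recall (sensitivity). The minimal sketch below shows how both are computed from confusion-matrix counts. The function name and the counts are hypothetical illustrations, not results from the paper.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from confusion-matrix counts.

    tp: COVID-19 cases correctly flagged
    fp: non-COVID cases wrongly flagged as COVID-19
    fn: COVID-19 cases the model missed
    """
    # Precision: of everything flagged positive, how much was truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of all true positives, how many were found. Fewer false
    # negatives means higher recall.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts for illustration only (not taken from the paper).
precision, recall = precision_recall(tp=90, fp=10, fn=30)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

A model can score high precision while still missing many positive cases, which is why the false-negative count is reported alongside precision.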
Abstract
Purpose: Rapid and reliable diagnostic tools are crucial for managing respiratory diseases like COVID-19, where chest X-ray analysis coupled with artificial intelligence techniques has proven invaluable. However, most existing works on X-ray images have not considered lung segmentation, raising concerns about their reliability. Additionally, some have employed disproportionate and impractical augmentation techniques, making models less generalized and prone to overfitting. This study presents a critical analysis of both issues and proposes a methodology (SDL-COVID) for more reliable classification of chest X-rays for COVID-19 detection. Methods: We use class activation mapping to obtain a visual understanding of the predictions made by Convolutional Neural Networks (CNNs), validating the necessity of lung segmentation. To analyze the effect of data augmentation, deep learning models are…
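The class activation mapping mentioned in the Methods can be illustrated with a small sketch. It assumes the standard CAM formulation: a CNN ending in global average pooling followed by one linear layer, where the map for a class is the weighted sum of the last convolutional feature maps. The abstract does not specify the CAM variant the authors use, so the function names and toy values here are hypothetical.

```python
def class_activation_map(feature_maps, class_weights):
    """Weighted sum of K feature maps (each H x W) for one class.

    feature_maps: list of K 2-D lists from the last conv layer (assumed)
    class_weights: list of K weights linking each map to the class score
    """
    if len(feature_maps) != len(class_weights):
        raise ValueError("need one weight per feature map")
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])
    return [
        [
            sum(w * fmap[r][c] for w, fmap in zip(class_weights, feature_maps))
            for c in range(width)
        ]
        for r in range(height)
    ]


def normalize(cam):
    """Rescale a map to [0, 1] so it can be overlaid on the X-ray."""
    lo = min(min(row) for row in cam)
    hi = max(max(row) for row in cam)
    if hi == lo:
        return [[0.0 for _ in row] for row in cam]
    return [[(v - lo) / (hi - lo) for v in row] for row in cam]


# Toy 2 x 2 feature maps (hypothetical values, for illustration only).
maps = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 2.0]]]
heatmap = normalize(class_activation_map(maps, [0.5, 1.0]))
```

If high values in such a heatmap fall outside the lung region, the classifier is relying on irrelevant image areas. This is the kind of evidence the abstract refers to when motivating lung segmentation.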
