Addressing Challenges in Data Quality and Model Generalization for Malaria Detection
Kiswendsida Kisito Kabore, Desire Guel

TL;DR
This paper analyzes data quality and model generalization challenges in AI-based malaria detection, proposing solutions like GAN augmentation and domain adaptation to improve accuracy and robustness in resource-limited settings.
Contribution
It offers a comprehensive analysis of challenges and introduces effective methods such as GAN-based augmentation and transfer learning to enhance malaria detection models.
Findings
Data imbalance causes up to 20% F1-score reduction.
Synthetic data improves accuracy by 15-20%.
Domain adaptation increases sensitivity by 25%.
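The first finding is stated in terms of F1-score rather than accuracy. As a minimal sketch of why that distinction matters under class imbalance, the snippet below computes both metrics from confusion-matrix counts. The counts are hypothetical and chosen only for illustration; they are not results from the paper, and the helper name `f1_report` is likewise an assumption.

```python
def f1_report(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 for the positive (parasitized) class."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical imbalanced test set: 1000 cells, only 50 of them parasitized.
# The classifier finds half of the parasitized cells and raises 10 false alarms.
imbalanced = f1_report(tp=25, fp=10, fn=25, tn=940)

for name, value in imbalanced.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts, accuracy stays high because the majority class dominates, while F1 on the parasitized class is much lower. This is why imbalance effects are usually reported with F1 or sensitivity rather than accuracy alone.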
Abstract
Malaria remains a significant global health burden, particularly in resource-limited regions where timely and accurate diagnosis is critical to effective treatment and control. Deep Learning (DL) has emerged as a transformative tool for automating malaria detection, offering high accuracy and scalability. However, the effectiveness of these models is constrained by challenges in data quality and model generalization, including imbalanced datasets, limited diversity, and annotation variability. These issues reduce diagnostic reliability and hinder real-world applicability. This article provides a comprehensive analysis of these challenges and their implications for malaria detection performance. Key findings highlight the impact of data imbalances, which can lead to a 20% drop in F1-score, and regional biases, which significantly hinder model generalization. Proposed solutions, such as…
