Cost-Sensitive Machine Learning Classification for Mass Tuberculosis Verbal Screening
Ali Akbar Septiandri, Aditiawarman, Roy Tjiong, Erlina Burhan, Anuraj Shankar

TL;DR
This study demonstrates that cost-sensitive machine learning models, particularly XGBoost with adjusted class weights, outperform traditional clinician-defined score-based methods in TB verbal screening, achieving higher sensitivity and specificity with limited data.
Contribution
The paper introduces a cost-sensitive machine learning approach for TB verbal screening that improves detection accuracy over traditional methods, even with small datasets.
Findings
XGBoost with class weight adjustment achieved 96.64% sensitivity.
Specificity improved by 13.19 percentage points (absolute) over the clinician-defined score-based method.
Only 2000 data points were needed for model convergence.
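For readers less familiar with the two metrics quoted in these findings, the following is a minimal sketch of how sensitivity and specificity are computed from binary labels and predictions. The function name and the toy labels are hypothetical and are not taken from the paper's data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = TB positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed cases
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # unnecessary lab tests
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)


# Toy example (hypothetical labels): 3 positives, 5 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 1, 1, 0, 0]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

In a screening setting, low sensitivity corresponds to missed cases, while low specificity corresponds to laboratory tests ordered for people who do not have TB.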
Abstract
Score-based algorithms for tuberculosis (TB) verbal screening perform poorly, causing misclassification that leads to missed cases and to unnecessary, costly laboratory tests for false positives. We compared score-based classification defined by clinicians with machine learning classifiers such as SVM-RBF, logistic regression, and XGBoost. We restricted our analyses to data from adults, the population most affected by TB, and investigated the difference between untuned, unweighted classifiers and cost-sensitive ones. Predictions were compared with the corresponding GeneXpert MTB/RIF results. After adjusting the weight of the positive class to 40 for XGBoost, we achieved 96.64% sensitivity and 35.06% specificity. As such, the sensitivity of our identifier increased by 1.26% while specificity increased by 13.19% in absolute value compared to the traditional score-based method defined…
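The effect of weighting the positive class can be illustrated without any library. The sketch below assumes that "cost-sensitive" means each positive example counts `pos_weight` times in a log-loss objective (XGBoost exposes a comparable knob as `scale_pos_weight`, though the abstract does not say which parameter the authors used). The function names are hypothetical, and the threshold formula holds only for well-calibrated probabilities.

```python
import math


def weighted_log_loss(y_true, p_pred, pos_weight=1.0):
    """Weighted mean log loss; each positive example counts pos_weight times."""
    total, weight_sum = 0.0, 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, 1e-12), 1 - 1e-12)  # clip to avoid log(0)
        w = pos_weight if y == 1 else 1.0
        total += -w * (y * math.log(p) + (1 - y) * math.log(1 - p))
        weight_sum += w
    return total / weight_sum


def cost_optimal_threshold(pos_weight):
    """Cutoff that minimizes expected cost when a missed positive
    costs pos_weight times as much as a false alarm.

    Predict positive when p * pos_weight > (1 - p), i.e. p > 1 / (1 + pos_weight).
    """
    return 1.0 / (1.0 + pos_weight)


# One missed-looking positive and one confident negative (hypothetical scores).
y_true = [1, 0]
p_pred = [0.1, 0.1]
unweighted = weighted_log_loss(y_true, p_pred)               # positives count once
weighted = weighted_log_loss(y_true, p_pred, pos_weight=40)  # positives count 40x
threshold = cost_optimal_threshold(40)                       # 1/41, about 0.024
```

With a weight of 40, the under-scored positive dominates the loss, so a model trained on this objective is pushed toward flagging more people as positive. That is consistent with the reported pattern of very high sensitivity alongside modest specificity.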
Taxonomy
Topics: Tuberculosis Research and Epidemiology · COVID-19 diagnosis using AI · Biomedical Text Mining and Ontologies
