A Data Mining Approach to the Diagnosis of Tuberculosis by Cascading Clustering and Classification
Asha.T, S. Natarajan, K.N.B. Murthy

TL;DR
This paper presents a combined clustering and classification approach using K-means and SVM to accurately diagnose and categorize tuberculosis, aiding medical decision-making.
Contribution
It introduces a novel hybrid methodology that integrates clustering and multiple classifiers, achieving high accuracy in TB diagnosis and classification.
Findings
The support vector machine (SVM) achieved the best accuracy, 98.7%.
Clustering helped improve classification performance.
The method aids doctors in diagnosis and treatment planning.
Abstract
In this paper, a methodology for the automated detection and classification of tuberculosis (TB) is presented. Tuberculosis is a disease caused by mycobacterium, which spreads through the air and easily attacks bodies with low immunity. Our methodology is based on clustering and classification and classifies TB into two categories: pulmonary tuberculosis (PTB) and retroviral PTB (RPTB), that is, PTB in patients with Human Immunodeficiency Virus (HIV) infection. Initially, K-means clustering is used to group the TB data into two clusters, and classes are assigned to the clusters. Subsequently, multiple classification algorithms are trained on the result set to build the final classifier model, using K-fold cross-validation. This methodology is evaluated using 700 raw TB records obtained from a city hospital. The best accuracy obtained was 98.7%, from the support vector machine (SVM), compared to other…
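The cascade described in the abstract (cluster first, treat the clusters as classes, then cross-validate a classifier on the labelled result) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic two-feature points rather than the hospital records, the mapping from cluster index to PTB/RPTB is arbitrary, and a simple nearest-centroid classifier stands in for the SVM so that the example needs only the Python standard library. All function and variable names are hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(rows):
    """Column-wise mean of a non-empty list of feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]

def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)), key=lambda c: sq_dist(point, centroids[c]))

def kmeans(points, k=2, max_iters=100, seed=0):
    """Plain Lloyd's K-means; returns one cluster index per point."""
    centroids = random.Random(seed).sample(points, k)
    assignment = None
    for _ in range(max_iters):
        new_assignment = [nearest(p, centroids) for p in points]
        if new_assignment == assignment:  # no change: converged
            break
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = mean(members)
    return assignment

def k_fold_accuracy(points, labels, folds=10, seed=0):
    """K-fold cross-validated accuracy of a nearest-centroid classifier
    (an assumed stand-in for the SVM named in the abstract)."""
    idx = list(range(len(points)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(folds):
        test_idx = idx[f::folds]  # every folds-th shuffled index
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        classes = sorted({labels[i] for i in train_idx})
        # "Training": one centroid per class, from the training fold only.
        class_centroids = [
            mean([points[i] for i in train_idx if labels[i] == c])
            for c in classes
        ]
        for i in test_idx:
            predicted = classes[nearest(points[i], class_centroids)]
            correct += int(predicted == labels[i])
    return correct / len(points)

# Synthetic stand-in data: two well-separated groups of 100 points each.
rng = random.Random(42)
records = ([[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)]
           + [[rng.gauss(8, 1), rng.gauss(8, 1)] for _ in range(100)])

# Stage 1: cluster, then name the clusters (this mapping is arbitrary here).
cluster_ids = kmeans(records, k=2)
class_names = {0: "PTB", 1: "RPTB"}
labels = [class_names[c] for c in cluster_ids]

# Stage 2: cross-validate a classifier on the cluster-derived labels.
accuracy = k_fold_accuracy(records, labels, folds=10)
print(f"10-fold accuracy: {accuracy:.3f}")
```

In this sketch the class labels come from the clusters themselves, so the reported accuracy measures how well the classifier reproduces the cluster assignment on held-out records.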
Taxonomy
Topics: Text and Document Classification Technologies · Advanced Clustering Algorithms Research · Image Retrieval and Classification Techniques
