Using Genetic Algorithms to Optimise Rough Set Partition Sizes for HIV Data Analysis
Bodie Crossingham, Tshilidzi Marwala

TL;DR
This paper introduces a genetic algorithm-based method to optimise rough set partition sizes, raising HIV data classification accuracy from 57.7% (equal-width bin partitioning) to 72.8%.
Contribution
It presents a novel approach combining genetic algorithms with rough set theory to enhance rule-based HIV data analysis.
Findings
Optimized partitions increased prediction accuracy from 57.7% to 72.8%.
Rough set theory provides interpretable rules for HIV classification.
The method outperforms equal width bin partitioning and other analysis techniques.
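The interpretability claim rests on how rough sets group records that share the same discretised attribute values. The following is a minimal sketch of the standard lower and upper approximations, not the authors' implementation; the function name `approximations` and the toy rows are hypothetical and carry no real survey data.

```python
from collections import defaultdict

def approximations(rows, decisions, target):
    """Return (lower, upper) approximation index sets for the `target` decision."""
    # Rows with identical condition-attribute values are indiscernible,
    # so they are grouped into one equivalence class.
    classes = defaultdict(set)
    for i, row in enumerate(rows):
        classes[tuple(row)].add(i)

    target_set = {i for i, d in enumerate(decisions) if d == target}
    lower, upper = set(), set()
    for members in classes.values():
        if members <= target_set:   # class lies entirely inside the target
            lower |= members
        if members & target_set:    # class overlaps the target
            upper |= members
    return lower, upper

# Toy, made-up data: two discretised attributes per record.
rows = [(0, 1), (0, 1), (1, 0), (1, 1)]
decisions = ["pos", "neg", "pos", "neg"]
lower, upper = approximations(rows, decisions, "pos")
```

Records in the lower approximation yield certain rules, while those only in the upper approximation yield possible rules. Coarser or finer partitions change which records are indiscernible, which is why partition sizes affect accuracy.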
Abstract
In this paper, we present a method to optimise rough set partition sizes, after which rule extraction is performed on HIV data. A genetic algorithm is used to determine the partition sizes of a rough set in order to maximise the rough set's prediction accuracy. The proposed method is tested on a set of demographic properties of individuals obtained from the South African antenatal survey. Six demographic variables were used in the analysis: race, age of mother, education, gravidity, parity, and age of father, with the outcome or decision being either HIV positive or negative. Rough set theory is chosen because the extracted rules are easy to interpret. The prediction accuracy of equal width bin partitioning is 57.7%, while the accuracy achieved after optimising the partitions is 72.8%. Several other methods have been used to…
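The idea of searching for partition boundaries with a genetic algorithm can be sketched as below. This is an illustrative sketch under stated assumptions, not the authors' implementation: the data is synthetic, the rule table is a simple majority vote per discretised cell rather than a full rough set rule extractor, and all names (`discretise`, `accuracy`, `equal_width_cuts`, `optimise`, `make_data`) and GA settings are hypothetical.

```python
import random
from bisect import bisect_right
from collections import Counter, defaultdict

def discretise(row, cuts):
    # Map each numeric attribute to the index of the partition it falls in.
    return tuple(bisect_right(sorted(c), v) for v, c in zip(row, cuts))

def accuracy(cuts, train, test):
    # Rule table: discretised condition -> majority decision seen in training.
    votes = defaultdict(Counter)
    for row, label in train:
        votes[discretise(row, cuts)][label] += 1
    default = Counter(label for _, label in train).most_common(1)[0][0]
    hits = 0
    for row, label in test:
        seen = votes.get(discretise(row, cuts))
        pred = seen.most_common(1)[0][0] if seen else default
        hits += pred == label
    return hits / len(test)

def equal_width_cuts(lo, hi, n_bins):
    step = (hi - lo) / n_bins
    return [lo + step * k for k in range(1, n_bins)]

def optimise(train, test, bounds, n_bins=4, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    fitness = lambda cuts: accuracy(cuts, train, test)
    # Seed the population with the equal-width baseline; with elitism the
    # best individual can then never score below that baseline.
    population = [[equal_width_cuts(lo, hi, n_bins) for lo, hi in bounds]]
    while len(population) < pop_size:
        population.append([[rng.uniform(lo, hi) for _ in range(n_bins - 1)]
                           for lo, hi in bounds])
    for _ in range(generations):
        scored = sorted(((fitness(ind), ind) for ind in population),
                        key=lambda s: s[0], reverse=True)
        next_gen = [ind for _, ind in scored[:2]]  # elitism: keep the top two
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            a = max(rng.sample(scored, 3), key=lambda s: s[0])[1]
            b = max(rng.sample(scored, 3), key=lambda s: s[0])[1]
            child = []
            for (lo, hi), ca, cb in zip(bounds, a, b):
                genes = [rng.choice(pair) for pair in zip(ca, cb)]  # uniform crossover
                for k in range(len(genes)):
                    if rng.random() < 0.2:  # Gaussian mutation, clipped to bounds
                        moved = genes[k] + rng.gauss(0, 0.05 * (hi - lo))
                        genes[k] = min(hi, max(lo, moved))
                child.append(genes)
            next_gen.append(child)
        population = next_gen
    best = max(population, key=fitness)
    return best, fitness(best)

def make_data(n, rng):
    # Synthetic records: the true class boundaries (37, 61, 18) deliberately
    # do not line up with equal-width bin edges.
    data = []
    for _ in range(n):
        row = (rng.uniform(0, 100), rng.uniform(0, 100))
        label = int(37 <= row[0] < 61) ^ int(row[1] < 18)
        if rng.random() < 0.1:  # 10% label noise
            label ^= 1
        data.append((row, label))
    return data

rng = random.Random(1)
train, test = make_data(300, rng), make_data(150, rng)
bounds = [(0, 100), (0, 100)]
baseline = accuracy([equal_width_cuts(lo, hi, 4) for lo, hi in bounds], train, test)
best_cuts, best_acc = optimise(train, test, bounds)
print(f"equal-width: {baseline:.3f}  GA-optimised: {best_acc:.3f}")
```

Because the same held-out set drives selection and reporting here, the optimised figure is optimistic; a real evaluation would score the final cut points on a third, untouched split.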
Taxonomy
Topics: Rough Sets and Fuzzy Logic · Data Mining Algorithms and Applications
