CRDT: Correlation Ratio Based Decision Tree Model for Healthcare Data Mining
Smita Roy, Samrat Mondal, Asif Ekbal

TL;DR
This paper introduces a decision tree model based on the correlation ratio (CR) and tailored to healthcare data. It addresses the bias of information gain (IG) based trees toward attributes with many distinct values and reports improved performance on benchmark datasets.
Contribution
The paper proposes a correlation ratio-based decision tree model that reduces attribute bias, enhancing classification accuracy for healthcare datasets.
Findings
Effective on benchmark healthcare datasets
Reduces bias towards attributes with many distinct values
Improves classification performance
Abstract
The phenomenal growth in healthcare data has inspired us to investigate robust and scalable models for data mining. For classification problems, the Information Gain (IG) based Decision Tree is one of the popular choices. However, depending on the nature of the dataset, an IG based Decision Tree may not always perform well, as it prefers the attribute with the larger number of distinct values as the splitting attribute. Healthcare datasets generally have many attributes, and each attribute generally has many distinct values. In this paper, we focus on these characteristics of the datasets while analysing the performance of our proposed approach, which is a variant of the Decision Tree model and uses the concept of Correlation Ratio (CR). Unlike the IG based approach, this CR based approach has no bias towards the attribute with the larger number of distinct values. We have applied our model on…
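The bias described in the abstract can be illustrated with a small sketch. The information gain function below follows the standard entropy-based definition. The correlation ratio function uses the textbook eta-squared (between-class variance of a numeric attribute divided by its total variance), grouping attribute values by class label. The abstract does not state how the paper computes CR or selects splits, so that grouping, the function names, and the toy data are all assumptions made for illustration, not the authors' method.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Standard IG: class entropy minus weighted entropy after splitting on each distinct value."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def correlation_ratio(values, labels):
    """Textbook eta squared of a numeric attribute, grouped by class label.

    Assumption: this grouping is illustrative; the paper's exact formulation
    is not given in the abstract.
    """
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[y].append(v)
    mean = sum(values) / len(values)
    total = sum((v - mean) ** 2 for v in values)
    if total == 0:
        return 0.0  # a constant attribute carries no information
    between = sum(len(g) * (sum(g) / len(g) - mean) ** 2 for g in groups.values())
    return between / total


# Hypothetical toy data: eight patients, two classes.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
patient_id = [1, 8, 3, 6, 2, 7, 4, 5]  # unique per row, unrelated to the class
marker = [1, 1, 2, 2, 3, 3, 4, 4]      # fewer distinct values, tracks the class

for name, attr in [("patient_id", patient_id), ("marker", marker)]:
    print(name, information_gain(attr, labels), correlation_ratio(attr, labels))
```

On this toy data, information gain scores the identifier-like attribute as highly as the informative one (both reach the maximum of 1 bit), because splitting on unique values trivially yields pure partitions. The eta-squared score is 0 for the identifier and about 0.8 for the marker, since the number of groups is fixed by the number of classes rather than by the number of distinct attribute values.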
