A New Method for Classification of Datasets for Data Mining

Singh Vijendra; Hemjyotsana Parashar; Nisha Vasudeva

arXiv:1612.00151·cs.LG·December 2, 2016

Open Access

TL;DR

This paper introduces an improved decision tree algorithm that addresses ID3's tendency to favor attributes with many values, enhancing classification accuracy and efficiency.

Contribution

The paper proposes a novel attribute grouping method and selection measure to improve decision tree classification over ID3.

Findings

01. More accurate data set classification

02. Enhanced efficiency in decision tree construction

03. Better handling of attributes with many values

Abstract

Decision trees are an important method in both induction research and data mining, used mainly for classification and prediction. ID3 is the most widely used decision tree algorithm so far. This paper discusses ID3's tendency to choose attributes with many values, and then proposes a new decision tree algorithm that is an improved version of ID3. In the proposed algorithm, attributes are divided into groups, and the selection measure 5 is applied to these groups. If the information gain is not good, the attribute values are divided into groups again. These steps are repeated until a good classification/misclassification ratio is obtained. The proposed algorithm classifies data sets more accurately and efficiently.
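
The bias the abstract refers to can be reproduced with the standard ID3 information gain. The sketch below is only an illustration under stated assumptions: the toy records, the attribute names `record_id` and `outlook`, and the two-way grouping of ID values are all invented here, and the grouping is not the authors' procedure, whose details the abstract does not give. It shows that an attribute with one distinct value per record gets the maximum gain despite having no predictive value, and that merging its values into groups removes that artificial advantage.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Standard ID3 information gain from splitting `labels` by `values`."""
    total = len(labels)
    partitions = {}
    for value, label in zip(values, labels):
        partitions.setdefault(value, []).append(label)
    # Weighted entropy that remains after the split.
    remainder = sum(len(part) / total * entropy(part) for part in partitions.values())
    return entropy(labels) - remainder


# Toy data, invented for illustration: 8 records with a binary class.
labels = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
record_id = list(range(8))  # one distinct value per record, no predictive value
outlook = ["sun", "sun", "rain", "rain", "sun", "rain", "rain", "sun"]

gain_id = information_gain(record_id, labels)
gain_outlook = information_gain(outlook, labels)

# Hypothetical grouping (an assumption, not the paper's rule):
# merge the 8 ID values into 2 groups before scoring.
grouped_id = ["low" if i < 4 else "high" for i in record_id]
gain_grouped_id = information_gain(grouped_id, labels)

print(f"record_id:         {gain_id:.4f}")
print(f"outlook:           {gain_outlook:.4f}")
print(f"record_id grouped: {gain_grouped_id:.4f}")
```

Plain ID3 would pick `record_id` as the split attribute here because every partition it produces is pure. Once its values are grouped, its gain falls below that of `outlook`, which is the kind of correction the proposed method aims for.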

Taxonomy

Topics: Data Mining Algorithms and Applications · Data Management and Algorithms · Time Series Analysis and Forecasting