Improving Performance of a Group of Classification Algorithms Using Resampling and Feature Selection

Mehdi Naseriparsa; Amir-masoud Bidgoli; Touraj Varaee

arXiv:1403.1946 · cs.LG · March 11, 2014 · 1 citation

TL;DR

This paper introduces a hybrid resampling and feature selection method that improves classification accuracy and reduces errors across multiple algorithms on a lung cancer dataset.

Contribution

It presents a novel combination of resampling, filtering, and genetic search for feature selection that outperforms existing methods in accuracy and cost.

Findings

01. Significant reduction in classification errors.
02. Improved average performance of five classifiers.
03. Outperforms other feature selection techniques.

Abstract

In recent years, finding meaningful patterns in huge datasets has become increasingly challenging. Data miners address this problem by applying feature selection methods. In this paper we propose a new hybrid method that combines resampling, filtering of the sample domain, and a wrapper subset evaluation method with genetic search to reduce the dimensionality of the Lung-Cancer dataset, which we obtained from the UCI Repository of Machine Learning Databases. Finally, we apply several well-known classification algorithms (Naïve Bayes, Logistic, Multilayer Perceptron, Best First Decision Tree and JRIP) to the resulting dataset and compare the results and prediction rates before and after applying our feature selection method. The results show a substantial improvement in the average performance of the five classification algorithms…
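
As a rough illustration of the wrapper idea the abstract describes, the Python sketch below resamples a dataset and then uses a small genetic search to pick a feature subset, scoring each candidate subset by the cross-validated accuracy of a classifier trained on it. It is not the authors' implementation. It assumes bootstrap sampling as the resampling step, uses synthetic data in place of the Lung-Cancer dataset, substitutes a nearest-centroid classifier for the five classifiers named above, and omits the sample-domain filtering step, which the abstract does not specify. All function names and parameter values are hypothetical.

```python
import random

def make_data(rng, n=60, n_features=12, informative=(0, 1, 2)):
    """Synthetic two-class data; only the `informative` columns carry signal."""
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 2.0 * label  # shift class 1 on the informative columns
        X.append(row)
        y.append(label)
    return X, y

def resample(X, y, rng):
    """Assumed resampling step: draw len(X) instances with replacement."""
    idx = [rng.randrange(len(X)) for _ in range(len(X))]
    return [X[i] for i in idx], [y[i] for i in idx]

def centroid_accuracy(X, y, mask, folds=3):
    """Wrapper fitness: k-fold accuracy of a nearest-centroid classifier
    trained only on the features where mask is True."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0  # an empty feature subset cannot classify anything
    correct = 0
    for f in range(folds):
        train = [i for i in range(len(X)) if i % folds != f]
        test = [i for i in range(len(X)) if i % folds == f]
        centroids = {}
        for c in set(y):
            rows = [X[i] for i in train if y[i] == c]
            if rows:
                centroids[c] = [sum(r[j] for r in rows) / len(rows) for j in cols]
        for i in test:
            pred = min(centroids, key=lambda c: sum(
                (X[i][j] - m) ** 2 for j, m in zip(cols, centroids[c])))
            correct += pred == y[i]
    return correct / len(X)

def genetic_search(X, y, rng, pop_size=12, generations=15, p_mut=0.1):
    """Search over feature masks with tournament selection, one-point
    crossover, bit-flip mutation, and elitism."""
    n = len(X[0])
    fitness = lambda m: centroid_accuracy(X, y, m)
    # Seed the population with the full feature set so that, with elitism,
    # the result never scores below the all-features baseline.
    pop = [[True] * n]
    pop += [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size - 1)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        nxt = [best]  # elitism: carry the best mask forward unchanged
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 3), key=fitness)  # tournament of size 3
            b = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n)
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < p_mut else g for g in child]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=fitness)
    return best

if __name__ == "__main__":
    rng = random.Random(0)
    X, y = make_data(rng)
    Xr, yr = resample(X, y, rng)
    mask = genetic_search(Xr, yr, rng)
    print("selected features:", [j for j, keep in enumerate(mask) if keep])
    print("accuracy, all features:", centroid_accuracy(Xr, yr, [True] * len(mask)))
    print("accuracy, selected:", centroid_accuracy(Xr, yr, mask))
```

Note that cross-validating on a bootstrap sample lets duplicated instances fall on both sides of a fold split, so the accuracy reported here is optimistic. A careful evaluation would hold out a test set before resampling.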

Taxonomy

Topics: Data Mining Algorithms and Applications · Text and Document Classification Technologies · Imbalanced Data Classification Techniques