Cost-Sensitive Feature Selection of Data with Errors

Hong Zhao; Fan Min; William Zhu

arXiv:1212.3185·cs.LG·June 4, 2013

TL;DR

This paper introduces a novel approach for cost-sensitive feature selection in data with measurement errors, incorporating test and misclassification costs, and proposes algorithms validated on UCI datasets.

Contribution

It develops a new data model and algorithms for cost-sensitive feature selection considering measurement errors and costs, advancing practical applications in data mining.

Findings

01. Pruning in the backtracking algorithm reduces the number of computational operations.

02. The heuristic algorithm often finds near-optimal solutions.

03. Both algorithms perform well on UCI datasets.
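
The pruning idea behind the first finding can be sketched in a few lines. This is only an illustration under stated assumptions, not the paper's algorithm: the function name `backtrack_min_cost`, the `misclass_cost` callback, and the cost model (total cost = sum of test costs of the selected features + an average misclassification cost supplied by the caller) are hypothetical. The sketch assumes all costs are non-negative, so a partial subset whose test cost alone already reaches the best total found so far can be skipped together with all of its supersets.

```python
from typing import Callable, FrozenSet, Sequence, Tuple


def backtrack_min_cost(
    test_costs: Sequence[float],
    misclass_cost: Callable[[FrozenSet[int]], float],
) -> Tuple[FrozenSet[int], float, int]:
    """Search feature subsets for the minimal total cost (hypothetical model).

    Total cost of a subset = sum of its test costs + misclass_cost(subset).
    Assumes every test cost and every misclassification cost is >= 0.
    Returns (best subset, best total cost, number of subsets evaluated).
    """
    n = len(test_costs)
    best_subset: FrozenSet[int] = frozenset()
    best_cost = misclass_cost(best_subset)  # baseline: buy no tests at all
    evaluations = 1

    def visit(start: int, chosen: FrozenSet[int], spent: float) -> None:
        nonlocal best_subset, best_cost, evaluations
        for i in range(start, n):
            new_spent = spent + test_costs[i]
            # Prune: every superset of chosen + {i} costs at least new_spent,
            # so none of them can beat the best total found so far.
            if new_spent >= best_cost:
                continue
            subset = chosen | {i}
            evaluations += 1
            total = new_spent + misclass_cost(subset)
            if total < best_cost:
                best_subset, best_cost = subset, total
            visit(i + 1, subset, new_spent)

    visit(0, frozenset(), 0.0)
    return best_subset, best_cost, evaluations
```

Comparing the returned `evaluations` count with the `2 ** n` subsets an exhaustive search would score gives a rough sense of how much work the bound saves on a given cost setting.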

Abstract

In data mining applications, feature selection is an essential process since it reduces a model's complexity. The cost of obtaining the feature values must be taken into consideration in many domains. In this paper, we study the cost-sensitive feature selection problem on numerical data with measurement errors, test costs and misclassification costs. The major contributions of this paper are four-fold. First, a new data model is built to address test costs and misclassification costs as well as error boundaries. Second, a covering-based rough set with measurement errors is constructed. Given a confidence interval, the neighborhood is an ellipse in a two-dimensional space, an ellipsoid in a three-dimensional space, and so on. Third, a new cost-sensitive feature selection problem is defined on this covering-based rough set. Fourth, both backtracking and heuristic algorithms are proposed to…
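
The ellipsoidal neighborhood mentioned in the abstract can be illustrated with a small membership test. The abstract does not give the exact formula, so this sketch assumes an axis-aligned ellipsoid whose semi-axis for each feature is an error bound derived from the chosen confidence interval; the function name `in_ellipsoidal_neighborhood` and the `radii` parameter are hypothetical.

```python
from typing import Sequence


def in_ellipsoidal_neighborhood(
    x: Sequence[float],
    y: Sequence[float],
    radii: Sequence[float],
) -> bool:
    """Return True if y lies inside the axis-aligned ellipsoid centred at x.

    radii[i] is the assumed error bound (semi-axis length) of feature i and
    must be strictly positive. Membership test (an assumption, not taken
    from the paper): sum(((x_i - y_i) / radii_i) ** 2) <= 1.
    """
    if not (len(x) == len(y) == len(radii)):
        raise ValueError("x, y and radii must have the same length")
    if any(r <= 0 for r in radii):
        raise ValueError("every radius must be strictly positive")
    return sum(((xi - yi) / r) ** 2 for xi, yi, r in zip(x, y, radii)) <= 1.0
```

Under this assumption, a point that differs from the centre by nearly the full error bound on every feature at once falls outside the neighborhood, even though it would fall inside a per-feature interval (box) test.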

Taxonomy

Topics: Rough Sets and Fuzzy Logic · Data Mining Algorithms and Applications · AI-based Problem Solving and Planning