Cost-Sensitive Feature Selection of Data with Errors
Hong Zhao, Fan Min, William Zhu

TL;DR
This paper introduces a novel approach for cost-sensitive feature selection in data with measurement errors, incorporating test and misclassification costs, and proposes algorithms validated on UCI datasets.
Contribution
It develops a new data model and algorithms for cost-sensitive feature selection considering measurement errors and costs, advancing practical applications in data mining.
Findings
Pruning in the backtracking algorithm reduces the number of operations performed.
The heuristic algorithm often finds near-optimal solutions.
The algorithms perform well on UCI datasets.
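To illustrate why pruning can cut the work of a backtracking search, here is a minimal sketch. It is not the paper's algorithm. It assumes the total cost of a feature subset is the sum of its test costs plus a non-negative misclassification cost. The names `backtrack_min_cost` and `toy_misclassification_cost`, and all cost values, are hypothetical.

```python
from typing import Callable, FrozenSet, Sequence, Tuple


def backtrack_min_cost(
    test_costs: Sequence[float],
    misclassification_cost: Callable[[FrozenSet[int]], float],
    prune: bool = True,
) -> Tuple[FrozenSet[int], float, int]:
    """Search feature subsets for the minimal total cost.

    Assumes test costs and misclassification costs are non-negative, so the
    test cost already spent is a lower bound on the total cost of any superset.
    Returns (best subset, best total cost, number of subsets evaluated).
    """
    n = len(test_costs)
    best_subset: FrozenSet[int] = frozenset()
    best_cost = misclassification_cost(best_subset)
    evaluated = 1  # the empty subset

    def search(start: int, chosen: FrozenSet[int], spent: float) -> None:
        nonlocal best_subset, best_cost, evaluated
        for i in range(start, n):
            new_spent = spent + test_costs[i]
            if prune and new_spent >= best_cost:
                # No superset of chosen | {i} can beat the current best.
                continue
            subset = chosen | {i}
            evaluated += 1
            total = new_spent + misclassification_cost(subset)
            if total < best_cost:
                best_cost, best_subset = total, subset
            search(i + 1, subset, new_spent)

    search(0, frozenset(), 0.0)
    return best_subset, best_cost, evaluated


def toy_misclassification_cost(subset: FrozenSet[int]) -> float:
    """Hypothetical costs for three features; values are made up."""
    if 2 in subset:
        return 0.0
    if 0 in subset and 1 in subset:
        return 1.0
    if 0 in subset:
        return 3.0
    if 1 in subset:
        return 4.0
    return 10.0


toy_test_costs = [2, 5, 9]  # hypothetical test cost per feature
pruned = backtrack_min_cost(toy_test_costs, toy_misclassification_cost)
full = backtrack_min_cost(toy_test_costs, toy_misclassification_cost, prune=False)
```

On these toy costs both runs select feature 0 with total cost 5.0, but the pruned search evaluates 2 subsets while the exhaustive one evaluates all 8.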
Abstract
In data mining applications, feature selection is an essential process since it reduces a model's complexity. The cost of obtaining the feature values must be taken into consideration in many domains. In this paper, we study the cost-sensitive feature selection problem on numerical data with measurement errors, test costs and misclassification costs. The major contributions of this paper are four-fold. First, a new data model is built to address test costs and misclassification costs as well as error boundaries. Second, a covering-based rough set with measurement errors is constructed. Given a confidence interval, the neighborhood is an ellipse in a two-dimensional space, an ellipsoid in a three-dimensional space, and so on. Third, a new cost-sensitive feature selection problem is defined on this covering-based rough set. Fourth, both backtracking and heuristic algorithms are proposed to…
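The ellipsoidal neighborhood mentioned in the abstract can be pictured with a small membership test. This is only a sketch under assumptions: it uses an axis-aligned ellipsoid whose semi-axis along each feature is a per-feature error bound, which may differ from the paper's exact definition. The function name `in_ellipsoidal_neighborhood` and the parameter `radii` are hypothetical.

```python
from typing import Sequence


def in_ellipsoidal_neighborhood(
    x: Sequence[float], y: Sequence[float], radii: Sequence[float]
) -> bool:
    """Return True if y lies in the axis-aligned ellipsoid centred at x.

    radii[i] is the assumed semi-axis length (error bound) for feature i.
    With two features the region is an ellipse, with three an ellipsoid.
    """
    if not (len(x) == len(y) == len(radii)):
        raise ValueError("x, y and radii must have the same length")
    if any(r <= 0 for r in radii):
        raise ValueError("every radius must be positive")
    # Sum of squared, radius-normalised differences; <= 1 means inside.
    distance = sum(((yi - xi) / ri) ** 2 for xi, yi, ri in zip(x, y, radii))
    return distance <= 1.0


# Two features with hypothetical error bounds 2.0 and 1.0.
inside = in_ellipsoidal_neighborhood([0.0, 0.0], [1.0, 0.0], [2.0, 1.0])
outside = in_ellipsoidal_neighborhood([0.0, 0.0], [0.0, 1.5], [2.0, 1.0])
```

Under this assumption, two objects are neighbors when their feature values differ by no more than the error bounds allow jointly, rather than feature by feature.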
Taxonomy
Topics: Rough Sets and Fuzzy Logic · Data Mining Algorithms and Applications · AI-based Problem Solving and Planning
