Parallel feature selection based on the trace ratio criterion
Thu Nguyen, Thanh Nhan Phan, Van Nhuong Nguyen, Thanh Binh Nguyen, Pål Halvorsen, Michael Riegler

TL;DR
This paper introduces PFST, a scalable parallel feature selection method using the trace criterion, which efficiently identifies important features for classification in large datasets, improving both speed and accuracy.
Contribution
The paper presents a novel parallel feature selection algorithm based on the trace criterion, capable of handling very large datasets with improved efficiency and classification performance.
Findings
PFST reduces feature selection time significantly compared to other methods.
Features selected by PFST lead to higher classification accuracy.
PFST effectively balances feature relevance and redundancy in large datasets.
Abstract
The growth of data today poses a challenge in management and inference. While feature extraction methods are capable of reducing the size of the data for inference, they do not help in minimizing the cost of data storage. Feature selection, on the other hand, removes redundant features and is therefore helpful not only in inference but also in reducing management costs. This work presents a novel parallel feature selection approach for classification, namely Parallel Feature Selection using Trace criterion (PFST), which scales up to very large datasets. Our method uses the trace criterion, a measure of class separability used in Fisher's Discriminant Analysis, to evaluate feature usefulness. We analyzed the criterion's desirable properties theoretically. Based on the criterion, PFST rapidly finds important features out of a set of features for big datasets by first making a…
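To make the idea of a trace criterion concrete, the sketch below scores a feature subset by one common form used in Fisher's Discriminant Analysis, tr(S_w^{-1} S_b), where S_w and S_b are the within-class and between-class scatter matrices. This is an illustration under assumptions, not the authors' implementation: the exact form of the criterion used by PFST, the function names `scatter_matrices` and `trace_criterion`, the use of a pseudo-inverse, and the synthetic data are all hypothetical choices made for this example. The parallel search itself is not shown.

```python
import numpy as np


def scatter_matrices(X, y):
    """Return within-class (S_w) and between-class (S_b) scatter matrices."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    d = X.shape[1]
    S_w = np.zeros((d, d))
    S_b = np.zeros((d, d))
    for label in np.unique(y):
        X_c = X[y == label]
        mean_c = X_c.mean(axis=0)
        centered = X_c - mean_c
        # Spread of each class around its own mean.
        S_w += centered.T @ centered
        # Spread of class means around the overall mean, weighted by class size.
        diff = (mean_c - overall_mean).reshape(-1, 1)
        S_b += X_c.shape[0] * (diff @ diff.T)
    return S_w, S_b


def trace_criterion(X, y, feature_idx):
    """Score a feature subset by tr(pinv(S_w) @ S_b); larger means more separable.

    Assumption: the pseudo-inverse is used so the score stays defined
    even when S_w is singular. The paper may handle this differently.
    """
    X_sub = np.asarray(X, dtype=float)[:, list(feature_idx)]
    S_w, S_b = scatter_matrices(X_sub, y)
    return float(np.trace(np.linalg.pinv(S_w) @ S_b))


# Synthetic two-class data: column 0 tracks the label, column 1 is pure noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 50)
informative = y + 0.1 * rng.standard_normal(100)
noise = rng.standard_normal(100)
X = np.column_stack([informative, noise])

score_informative = trace_criterion(X, y, [0])
score_noise = trace_criterion(X, y, [1])
print(score_informative > score_noise)
```

In this toy setting the label-tracking column receives a much larger score than the noise column, which is the behavior a selection procedure built on such a criterion would rely on when ranking candidate features.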
Taxonomy
Topics: Face and Expression Recognition · Machine Learning and Data Classification · Machine Learning and ELM
Methods: Feature Selection
