Feature space reduction method for ultrahigh-dimensional, multiclass data: Random forest-based multiround screening (RFMS)
Gergely Hanczár, Marcell Stippinger, Dávid Hanák, Marcell T. Kurbucz, Olivér M. Törteli, Ágnes Chripkó, Zoltán Somogyvári

TL;DR
This paper introduces RFMS, a feature reduction method tailored for ultrahigh-dimensional, multiclass data such as biometric data with thousands of classes. On a synthetic benchmark, it performs on par with industry-standard methods.
Contribution
RFMS is a new multiround screening algorithm that efficiently reduces the feature space of ultrahigh-dimensional, multiclass datasets, including data with thousands of classes, which most earlier screening methods cannot handle.
Findings
RFMS performs comparably to industry-standard methods.
RFMS effectively handles data with thousands of classes.
RFMS offers advantages such as efficiency and scalability.
Abstract
In recent years, numerous screening methods have been published for ultrahigh-dimensional data that contain hundreds of thousands of features; however, most of these methods cannot handle data with thousands of classes. Prediction models built to authenticate users based on multichannel biometric data give rise to this type of problem. In this study, we present a novel method known as random forest-based multiround screening (RFMS) that can be effectively applied under such circumstances. The proposed algorithm divides the feature space into small subsets and executes a series of partial model builds. These partial models are used to implement tournament-based sorting and the selection of features based on their importance. To benchmark RFMS, a synthetic biometric feature space generator known as BiometricBlender is employed. Based on the results, RFMS is on par with industry-standard…
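The tournament idea in the abstract (split the features into small subsets, score each subset with a partial model, let the top-ranked features advance, repeat) can be illustrated with a minimal sketch. This is not the authors' implementation. Every name and parameter here (multiround_screen, variance_ratio_importance, make_toy_data, subset_size, keep_frac) is hypothetical, and the scorer is a simple standard-library stand-in for the random forest importances that RFMS uses, chosen only so the example runs without extra dependencies.

```python
import random
from statistics import mean, pvariance


def variance_ratio_importance(X, y, cols):
    """Stand-in scorer (NOT the random forest importance RFMS uses):
    variance of the class means divided by the mean within-class variance."""
    scores = {}
    for c in cols:
        by_class = {}
        for row, label in zip(X, y):
            by_class.setdefault(label, []).append(row[c])
        between = pvariance([mean(v) for v in by_class.values()])
        within = mean([pvariance(v) for v in by_class.values()])
        scores[c] = between / (within + 1e-12)
    return scores


def multiround_screen(X, y, n_keep, subset_size=10, keep_frac=0.5,
                      importance=variance_ratio_importance, seed=0):
    """Tournament-style screening sketch: returns at most n_keep column indices."""
    if not 0 < n_keep < subset_size:
        raise ValueError("need 0 < n_keep < subset_size")
    if not 0 < keep_frac < 1:
        raise ValueError("need 0 < keep_frac < 1")
    rng = random.Random(seed)
    survivors = list(range(len(X[0])))
    while len(survivors) > n_keep:
        rng.shuffle(survivors)  # new random grouping each round
        winners = []
        for start in range(0, len(survivors), subset_size):
            subset = survivors[start:start + subset_size]
            scores = importance(X, y, subset)  # one "partial model build"
            ranked = sorted(subset, key=scores.get, reverse=True)
            # Assumption of this sketch: each subset advances at least n_keep
            # features, so the round always shrinks but never undershoots.
            n_win = max(n_keep, int(len(ranked) * keep_frac))
            winners.extend(ranked[:n_win])
        survivors = winners
    final_scores = importance(X, y, survivors)
    return sorted(survivors, key=final_scores.get, reverse=True)


def make_toy_data(n_classes=4, per_class=40, n_features=60,
                  n_informative=3, seed=1):
    """Toy data: the first n_informative columns shift with the class label,
    all remaining columns are pure Gaussian noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in range(n_classes):
        for _ in range(per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
            for c in range(n_informative):
                row[c] += 3.0 * label
            X.append(row)
            y.append(label)
    return X, y
```

Calling multiround_screen(*make_toy_data(), n_keep=3) runs the rounds on the toy data and returns three column indices. To move closer to the method described in the abstract, the importance argument could be replaced by a function that fits a random forest on the subset's columns and returns its per-feature importances.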
Taxonomy
Topics: Face and Expression Recognition · Image Retrieval and Classification Techniques
