BiometricBlender: Ultra-high dimensional, multi-class synthetic data generator to imitate biometric feature space
Marcell Stippinger, Dávid Hanák, Marcell T. Kurbucz, Gergely Hanczár, Olivér M. Törteli, Zoltán Somogyvári

TL;DR
BiometricBlender is a Python tool that generates ultra-high dimensional, multi-class synthetic biometric data with controllable feature properties to aid benchmarking of feature screening methods.
Contribution
The paper introduces BiometricBlender, a novel synthetic data generator that mimics real biometric datasets for benchmarking feature screening techniques.
Findings
Allows control over feature usefulness and intercorrelations
Generates data that imitates key properties of real biometric datasets
Facilitates benchmarking of feature screening methods
Abstract
The lack of freely available (real-life or synthetic) high or ultra-high dimensional, multi-class datasets may hamper the rapidly growing research on feature screening, especially in the field of biometrics, where the usage of such datasets is common. This paper reports a Python package called BiometricBlender, which is an ultra-high dimensional, multi-class synthetic data generator to benchmark a wide range of feature screening methods. During the data generation process, the overall usefulness and the intercorrelations of blended features can be controlled by the user; thus, the synthetic feature space is able to imitate the key properties of a real biometric dataset.
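The idea of blending a small set of hidden features into a much larger, intercorrelated feature space can be illustrated with a minimal NumPy sketch. This is not the BiometricBlender API: the function `blend_features` and all of its parameters are hypothetical names chosen for illustration, and the Gaussian hidden features and linear mixing are simplifying assumptions rather than the package's actual method.

```python
import numpy as np


def blend_features(n_classes=5, samples_per_class=20, n_hidden=8,
                   n_useful=4, n_out=200, noise=0.5, seed=0):
    """Toy generator (hypothetical, not the BiometricBlender API).

    A few hidden features are drawn per sample; only the first `n_useful`
    of them depend on the class label. Every output feature is a random
    linear blend of all hidden features plus independent noise.
    """
    rng = np.random.default_rng(seed)
    n_samples = n_classes * samples_per_class
    labels = np.repeat(np.arange(n_classes), samples_per_class)

    # Hidden features: standard normal, with class-dependent means added
    # to the first n_useful columns only (these carry the class signal).
    hidden = rng.normal(size=(n_samples, n_hidden))
    class_means = rng.normal(size=(n_classes, n_useful))
    hidden[:, :n_useful] += class_means[labels]

    # Blending: n_out >> n_hidden, so output features share hidden
    # sources and are therefore correlated with each other.
    mixing = rng.normal(size=(n_hidden, n_out))
    X = hidden @ mixing + noise * rng.normal(size=(n_samples, n_out))
    return X, labels


def mean_abs_correlation(X):
    """Average absolute off-diagonal Pearson correlation between columns."""
    corr = np.corrcoef(X, rowvar=False)
    off_diag = corr[~np.eye(corr.shape[0], dtype=bool)]
    return float(np.abs(off_diag).mean())


if __name__ == "__main__":
    X, y = blend_features()
    print(X.shape, y.shape)
    print(mean_abs_correlation(X))
```

In this sketch, the ratio `n_useful / n_hidden` plays the role of overall usefulness, while `noise` weakens the intercorrelations between the blended output features; both knobs are stand-ins for the user controls described in the abstract.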
Taxonomy
Topics: Image Processing and 3D Reconstruction · Computational Physics and Python Applications · Remote Sensing and LiDAR Applications
