Reducing Instability in Synthetic Data Evaluation with a Super-Metric in MalDataGen
Anna Luiza Gomes da Silva, Diego Kreutz, Angelo Diniz, Rodrigo Mansilha, Celso Nobre da Fonseca

TL;DR
This paper introduces a Super-Metric for evaluating synthetic Android malware data, combining multiple metrics into a single score to improve stability and correlation with classifier performance.
Contribution
The paper presents a Super-Metric, integrated into MalDataGen, that aggregates eight metrics across four fidelity dimensions into a single weighted score, making the evaluation of synthetic data more stable.
Findings
The Super-Metric is more stable and consistent than traditional metrics.
The Super-Metric correlates more strongly with the actual performance of classifiers.
Both results are demonstrated in experiments with ten generative models and five balanced datasets.
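As a rough illustration of what "stability" and "correlation with classifier performance" could mean in practice, the sketch below computes a coefficient of variation across repeated runs and a Pearson correlation between metric scores and classifier scores. Both choices are assumptions made for illustration; this summary does not state which statistics the paper uses. The function names `coefficient_of_variation` and `pearson` are hypothetical and are not part of MalDataGen.

```python
import math
from statistics import mean, pstdev
from typing import Sequence


def coefficient_of_variation(values: Sequence[float]) -> float:
    """Relative spread of a metric across repeated runs (lower = more stable).

    Assumption: stability is measured as population std / |mean|.
    """
    m = mean(values)
    if m == 0:
        raise ValueError("coefficient of variation is undefined for zero mean")
    return pstdev(values) / abs(m)


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation between metric scores and classifier scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally sized samples with at least 2 points")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / (sx * sy)
```

Under these assumptions, a metric with a lower coefficient of variation across runs and a higher correlation with downstream classifier scores would count as the more reliable one.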
Abstract
Evaluating the quality of synthetic data remains a persistent challenge in the Android malware domain due to instability and the lack of standardization among existing metrics. This work integrates into MalDataGen a Super-Metric that aggregates eight metrics across four fidelity dimensions, producing a single weighted score. Experiments involving ten generative models and five balanced datasets demonstrate that the Super-Metric is more stable and consistent than traditional metrics, exhibiting stronger correlations with the actual performance of classifiers.
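The aggregation described in the abstract can be pictured with a minimal sketch. It assumes that every metric has already been normalized to [0, 1] with higher meaning better, that metrics are averaged within each dimension, and that dimensions are then combined by a weighted mean. The abstract does not give the actual metrics, normalization, or weights, so the function name `super_metric`, the dimension names, and the weights below are all hypothetical.

```python
from typing import Dict


def super_metric(
    scores: Dict[str, Dict[str, float]],
    dimension_weights: Dict[str, float],
) -> float:
    """Combine per-dimension metric scores into one weighted score.

    scores: {dimension: {metric_name: value in [0, 1], higher is better}}
    dimension_weights: {dimension: non-negative weight}
    Assumption: plain mean inside a dimension, weighted mean across them.
    """
    total_weight = sum(dimension_weights[d] for d in scores)
    if total_weight <= 0:
        raise ValueError("dimension weights must sum to a positive value")

    weighted_sum = 0.0
    for dimension, metrics in scores.items():
        if not metrics:
            raise ValueError(f"dimension {dimension!r} has no metrics")
        for name, value in metrics.items():
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"metric {name!r} is not normalized to [0, 1]")
        dimension_score = sum(metrics.values()) / len(metrics)
        weighted_sum += dimension_weights[dimension] * dimension_score

    return weighted_sum / total_weight


# Placeholder dimensions and metrics: eight metrics over four dimensions.
example_scores = {
    "dimension_a": {"metric_1": 1.0, "metric_2": 0.5},
    "dimension_b": {"metric_3": 0.5, "metric_4": 0.5},
    "dimension_c": {"metric_5": 0.0, "metric_6": 1.0},
    "dimension_d": {"metric_7": 0.25, "metric_8": 0.75},
}
equal_weights = {d: 1.0 for d in example_scores}
```

Calling `super_metric(example_scores, equal_weights)` yields a single score for one generated dataset; changing the weights shifts how much each fidelity dimension contributes.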
Taxonomy
Topics: Advanced Malware Detection Techniques · Software Testing and Debugging Techniques · Software Engineering Research
