Reducing Instability in Synthetic Data Evaluation with a Super-Metric in MalDataGen

Anna Luiza Gomes da Silva; Diego Kreutz; Angelo Diniz; Rodrigo Mansilha; Celso Nobre da Fonseca

arXiv:2511.16373 · cs.AI · November 21, 2025

Open Access

TL;DR

This paper introduces a Super-Metric for evaluating synthetic Android malware data, combining multiple metrics into a single score to improve stability and correlation with classifier performance.

Contribution

The paper integrates into MalDataGen a Super-Metric that aggregates eight metrics across four fidelity dimensions into a single weighted score, making the evaluation of synthetic data more stable.

Findings

01

Super-Metric shows higher stability than traditional metrics.

02

Super-Metric correlates more strongly with classifier performance.

03

These results are demonstrated across ten generative models and five balanced datasets.
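
The first two findings rest on two measurable quantities: how much a metric varies across runs, and how closely it tracks classifier performance. The sketch below shows one common way to compute each. It is an illustration under stated assumptions, not the paper's procedure: the summary above does not say how stability or correlation were measured, so the coefficient of variation and Pearson's r are stand-ins chosen here, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Relative spread of a metric across repeated runs (lower = more stable).

    Assumption: stability is approximated by std / |mean|; the paper may
    use a different measure.
    """
    m = mean(values)
    if m == 0:
        raise ValueError("coefficient of variation is undefined for zero mean")
    return pstdev(values) / abs(m)


def pearson(xs, ys):
    """Pearson correlation between metric scores and classifier scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally sized samples with at least 2 points")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)
```

Under these assumptions, a metric would count as "more stable" when its coefficient of variation across runs is lower, and as "more strongly correlated" when the absolute value of `pearson(metric_scores, classifier_scores)` is higher.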

Abstract

Evaluating the quality of synthetic data remains a persistent challenge in the Android malware domain due to instability and the lack of standardization among existing metrics. This work integrates into MalDataGen a Super-Metric that aggregates eight metrics across four fidelity dimensions, producing a single weighted score. Experiments involving ten generative models and five balanced datasets demonstrate that the Super-Metric is more stable and consistent than traditional metrics, exhibiting stronger correlations with the actual performance of classifiers.
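
The aggregation described in the abstract can be sketched as a weighted average over per-dimension scores. This is a minimal illustration, not the MalDataGen implementation: the dimension names, metric names, the even split of two metrics per dimension, the equal weights, and the assumption that every metric is already normalized to [0, 1] with higher meaning better are all placeholders introduced here.

```python
from statistics import mean

# Hypothetical layout: four fidelity dimensions, eight metrics in total.
# The names and the two-per-dimension split are placeholders, not the paper's.
DIMENSIONS = {
    "dimension_a": ["metric_1", "metric_2"],
    "dimension_b": ["metric_3", "metric_4"],
    "dimension_c": ["metric_5", "metric_6"],
    "dimension_d": ["metric_7", "metric_8"],
}

# Hypothetical equal weights; the paper's actual weights are not given here.
WEIGHTS = {name: 0.25 for name in DIMENSIONS}


def super_metric(scores, dimensions=DIMENSIONS, weights=WEIGHTS):
    """Combine normalized metric scores into one weighted score.

    `scores` maps metric name -> value in [0, 1], higher is better
    (an assumption of this sketch).
    """
    total_weight = sum(weights[dim] for dim in dimensions)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = 0.0
    for dim, metric_names in dimensions.items():
        # Average the metrics inside a dimension, then weight the dimension.
        dim_score = mean(scores[name] for name in metric_names)
        weighted_sum += weights[dim] * dim_score
    return weighted_sum / total_weight


# Usage: score one synthetic dataset from its eight metric values.
example_scores = {f"metric_{i}": 0.5 for i in range(1, 9)}
example_value = super_metric(example_scores)
```

Dividing by the total weight keeps the result in [0, 1] even if the weights do not sum to one, which makes scores comparable across weight configurations.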

Taxonomy

Topics: Advanced Malware Detection Techniques · Software Testing and Debugging Techniques · Software Engineering Research