Reducing Instability in Synthetic Data Evaluation with a Super-Metric in MalDataGen

Published: November 20, 2025 | arXiv ID: 2511.16373v1

By: Anna Luiza Gomes da Silva, Diego Kreutz, Angelo Diniz and more

Potential Business Impact:

Makes it easier to judge whether synthetic Android malware data is good enough for training phone security tools.

Business Areas:
Intelligent Systems Artificial Intelligence, Data and Analytics, Science and Engineering

Evaluating the quality of synthetic data remains a persistent challenge in the Android malware domain due to instability and the lack of standardization among existing metrics. This work integrates into MalDataGen a Super-Metric that aggregates eight metrics across four fidelity dimensions, producing a single weighted score. Experiments involving ten generative models and five balanced datasets demonstrate that the Super-Metric is more stable and consistent than traditional metrics, exhibiting stronger correlations with the actual performance of classifiers.
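
As a rough illustration of what a single weighted score built from several metrics can look like, the sketch below averages the metric scores inside each dimension and then combines the dimension averages with normalized weights. This is not the MalDataGen implementation: the abstract does not list the metrics, the dimensions, the weights, or the aggregation rule, so the function name `super_metric`, the placeholder dimension and metric names, the example values, the split of two metrics per dimension, and the mean-then-weighted-sum scheme are all assumptions made for illustration. Scores are assumed to be already normalized to [0, 1] with higher meaning better.

```python
from statistics import fmean


def super_metric(scores, weights):
    """Combine per-dimension metric scores into one weighted value in [0, 1].

    scores:  {dimension: {metric_name: score in [0, 1]}}  (hypothetical layout)
    weights: {dimension: non-negative weight}             (hypothetical layout)
    """
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same dimensions")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    aggregate = 0.0
    for dimension, metric_scores in scores.items():
        values = list(metric_scores.values())
        if not values or any(not 0.0 <= v <= 1.0 for v in values):
            raise ValueError(f"{dimension}: scores must be non-empty and in [0, 1]")
        # Assumed rule: plain mean inside a dimension, weighted sum across dimensions.
        aggregate += weights[dimension] * fmean(values)
    return aggregate / total_weight


# Placeholder data: eight made-up metric scores spread over four made-up dimensions.
example_scores = {
    "dimension_a": {"metric_1": 0.90, "metric_2": 0.80},
    "dimension_b": {"metric_3": 0.70, "metric_4": 0.60},
    "dimension_c": {"metric_5": 0.50, "metric_6": 0.70},
    "dimension_d": {"metric_7": 1.00, "metric_8": 0.80},
}
example_weights = {
    "dimension_a": 0.4,
    "dimension_b": 0.3,
    "dimension_c": 0.2,
    "dimension_d": 0.1,
}

score = super_metric(example_scores, example_weights)
print(f"aggregate score: {score:.3f}")
```

Dividing by the total weight keeps the result in [0, 1] even when the weights do not sum to one, which makes scores comparable across weightings in this sketch.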

Page Count
5 pages

Category
Computer Science:
Artificial Intelligence