Better Call Graphs: A New Dataset of Function Call Graphs for Malware Classification
By: Jakir Hossain, Gurvinder Singh, Lukasz Ziarek and more
Potential Business Impact:
Creates better tools to find phone viruses.
Function call graphs (FCGs) have emerged as a powerful abstraction for malware detection, capturing the behavioral structure of applications beyond surface-level signatures. Their utility in traditional program analysis has been well established, enabling effective classification and analysis of malicious software. In the mobile domain, especially in the Android ecosystem, FCG-based malware classification is particularly critical due to the platform's widespread adoption and the complex, component-based structure of Android apps. However, progress in this direction is hindered by the lack of large-scale, high-quality Android-specific FCG datasets. Existing datasets are often outdated, dominated by small or redundant graphs resulting from app repackaging, and fail to reflect the diversity of real-world malware. These limitations lead to overfitting and unreliable evaluation of graph-based classification methods. To address this gap, we introduce Better Call Graphs (BCG), a comprehensive dataset of large and unique FCGs extracted from recent Android application packages (APKs). BCG includes both benign and malicious samples spanning various families and types, along with graph-level features for each APK. Through extensive experiments using baseline classifiers, we demonstrate the necessity and value of BCG compared to existing datasets. BCG is publicly available at https://erdemub.github.io/BCG-dataset.
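To make the idea of "graph-level features" concrete, here is a minimal sketch in Python. It treats a function call graph as a list of (caller, callee) pairs and computes a few generic structural statistics. The function name `graph_level_features`, the toy edge list, and the choice of features are all illustrative assumptions; the abstract does not say which features BCG actually ships or how they are computed.

```python
from collections import defaultdict


def graph_level_features(call_edges):
    """Compute simple graph-level statistics from (caller, callee) pairs.

    Hypothetical feature set for illustration only; not BCG's actual features.
    """
    edges = set(call_edges)  # collapse repeated call sites into one edge
    nodes = set()
    out_deg = defaultdict(int)
    in_deg = defaultdict(int)
    for caller, callee in edges:
        nodes.update((caller, callee))
        out_deg[caller] += 1
        in_deg[callee] += 1

    n, m = len(nodes), len(edges)
    return {
        "num_nodes": n,
        "num_edges": m,
        # Directed density, assuming no self-loops: m / (n * (n - 1)).
        "density": m / (n * (n - 1)) if n > 1 else 0.0,
        "avg_out_degree": m / n if n else 0.0,
        "max_in_degree": max(in_deg.values(), default=0),
        # Functions that call nothing else within the graph.
        "num_leaf_functions": sum(1 for v in nodes if out_deg[v] == 0),
    }


# Toy call graph with made-up method names (not taken from any real APK).
toy_edges = [
    ("MainActivity.onCreate", "Helper.loadConfig"),
    ("MainActivity.onCreate", "Net.send"),
    ("Helper.loadConfig", "Net.send"),
    ("Net.send", "Crypto.encrypt"),
    ("MainActivity.onCreate", "Net.send"),  # duplicate call site
]
features = graph_level_features(toy_edges)
print(features)
```

A feature dictionary like this, computed once per APK, could be fed to a baseline tabular classifier, while the edge list itself would be the input to a graph-based model. Which of these the authors' baseline experiments use is not stated in the abstract.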