Better Call Graphs: A New Dataset of Function Call Graphs for Malware Classification

Published: December 24, 2025 | arXiv ID: 2512.20872v1

By: Jakir Hossain, Gurvinder Singh, Lukasz Ziarek and more

Potential Business Impact:

Provides a dataset for building better tools to detect Android malware.

Business Areas:
Big Data, Data and Analytics

Function call graphs (FCGs) have emerged as a powerful abstraction for malware detection, capturing the behavioral structure of applications beyond surface-level signatures. Their utility in traditional program analysis has been well established, enabling effective classification and analysis of malicious software. In the mobile domain, especially in the Android ecosystem, FCG-based malware classification is particularly critical due to the platform's widespread adoption and the complex, component-based structure of Android apps. However, progress in this direction is hindered by the lack of large-scale, high-quality Android-specific FCG datasets. Existing datasets are often outdated, dominated by small or redundant graphs resulting from app repackaging, and fail to reflect the diversity of real-world malware. These limitations lead to overfitting and unreliable evaluation of graph-based classification methods. To address this gap, we introduce Better Call Graphs (BCG), a comprehensive dataset of large and unique FCGs extracted from recent Android application packages (APKs). BCG includes both benign and malicious samples spanning various families and types, along with graph-level features for each APK. Through extensive experiments using baseline classifiers, we demonstrate the necessity and value of BCG compared to existing datasets. BCG is publicly available at https://erdemub.github.io/BCG-dataset.

Country of Origin
🇺🇸 United States

Page Count
13 pages

Category
Computer Science:
Cryptography and Security