BinCtx: Multi-Modal Representation Learning for Robust Android App Behavior Detection

Published: October 16, 2025 | arXiv ID: 2510.14344v1

By: Zichen Liu, Shao Yang, Xusheng Xiao

Potential Business Impact:

Detects undesired or malicious app behaviors by analyzing both an app's code and the context in which its behaviors are triggered.

Business Areas:
Biometrics, Biotechnology, Data and Analytics, Science and Engineering

Mobile app markets host millions of apps, yet undesired behaviors (e.g., disruptive ads, illegal redirection, payment deception) remain hard to catch because they often do not rely on permission-protected APIs and can be easily camouflaged via UI or metadata edits. We present BINCTX, a learning approach that builds multi-modal representations of an app from (i) a global bytecode-as-image view that captures code-level semantics and family-style patterns, (ii) a contextual view (manifest actions, components, declared permissions, URL/IP constants) indicating how behaviors are triggered, and (iii) a third-party-library usage view summarizing invocation frequencies along inter-component call paths. The three views are embedded and fused to train a context-aware classifier. On real-world malware and benign apps, BINCTX attains a macro F1 of 94.73%, outperforming strong baselines by at least 14.92%. It remains robust under commercial obfuscation (84% F1 post-obfuscation) and is more resistant to adversarial samples than state-of-the-art bytecode-only systems.
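
The abstract describes three views that are embedded separately and then fused for classification. Below is a minimal sketch of that fusion idea, assuming a raw-bytes-to-grayscale-image conversion, a small CNN encoder for the bytecode view, fixed-length multi-hot/frequency vectors for the contextual and third-party-library views, and an MLP fusion head. These module choices are illustrative assumptions, not the paper's actual architecture.

```python
# Sketch of the three-view fusion described in the abstract.
# Encoder and fusion choices here are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F


def bytecode_to_image(dex_bytes: bytes, side: int = 256) -> torch.Tensor:
    """Reshape raw DEX bytes into a square grayscale image (pad or truncate)."""
    buf = torch.frombuffer(bytearray(dex_bytes), dtype=torch.uint8).float() / 255.0
    buf = F.pad(buf, (0, max(0, side * side - buf.numel())))[: side * side]
    return buf.view(1, side, side)  # (channels, H, W)


class ThreeViewClassifier(nn.Module):
    def __init__(self, ctx_dim: int, tpl_dim: int, n_classes: int, emb: int = 128):
        super().__init__()
        # (i) bytecode-as-image view: small CNN encoder (assumed)
        self.img_enc = nn.Sequential(
            nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, emb),
        )
        # (ii) contextual view: manifest actions, components, permissions,
        #      URL/IP constants encoded as one multi-hot vector (assumed)
        self.ctx_enc = nn.Sequential(nn.Linear(ctx_dim, emb), nn.ReLU())
        # (iii) third-party-library view: invocation-frequency vector (assumed)
        self.tpl_enc = nn.Sequential(nn.Linear(tpl_dim, emb), nn.ReLU())
        # Concatenate the three embeddings and classify
        self.head = nn.Sequential(
            nn.Linear(3 * emb, emb), nn.ReLU(), nn.Linear(emb, n_classes),
        )

    def forward(self, img, ctx, tpl):
        z = torch.cat([self.img_enc(img), self.ctx_enc(ctx), self.tpl_enc(tpl)], dim=-1)
        return self.head(z)


# Toy usage: a batch of 4 "apps" with random contextual and TPL features.
model = ThreeViewClassifier(ctx_dim=512, tpl_dim=200, n_classes=2)
imgs = torch.stack([bytecode_to_image(bytes(range(256)) * 100) for _ in range(4)])
logits = model(imgs, torch.rand(4, 512), torch.rand(4, 200))
```

In a real pipeline, the contextual vector would be derived from the decoded AndroidManifest.xml and string constants, and the third-party-library vector from invocation counts along inter-component call paths, as the abstract outlines.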

Page Count
14 pages

Category
Computer Science:
Cryptography and Security