
Policy-Value Guided MDP-MCTS Framework for Cyber Kill-Chain Inference

Published: December 17, 2025 | arXiv ID: 2512.15150v1

By: Chitraksh Singh, Monisha Dhanraj, Ken Huang

Potential Business Impact:

Builds complete hacker attack maps from reports.

Business Areas:
Intrusion Detection, Information Technology, Privacy and Security

Threat analysts routinely rely on natural-language reports that describe attacker actions without enumerating the full kill chain or the dependencies between phases, making automated reconstruction of ATT&CK-consistent intrusion paths a difficult open problem. We propose a reasoning framework that infers complete seven-phase kill chains by coupling phase-conditioned semantic priors from Transformer models with a symbolic Markov Decision Process and an AlphaZero-style Monte Carlo Tree Search guided by a Policy-Value Network. The framework enforces semantic relevance, phase cohesion, and transition plausibility through a multi-objective reward function while allowing search to explore alternative interpretations of the CTI narrative. Applied to three real intrusions (FIN6, APT24, and UNC1549), the approach yields kill chains that surpass Transformer baselines in semantic fidelity and operational coherence, and frequently align with expert-selected TTPs. Our results demonstrate that combining contextual embeddings with search-based decision-making offers a practical path toward automated, interpretable kill-chain reconstruction for cyber defense.
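The search loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Node class, the linear form of the reward, the weights, and the c_puct constant are all assumptions made for illustration. Only the three reward objectives (semantic relevance, phase cohesion, transition plausibility) and the AlphaZero-style, prior-guided selection come from the abstract.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    """One search-tree node, standing for a partial kill chain (hypothetical structure)."""
    prior: float                 # P(s, a): probability assigned by the policy head
    visits: int = 0              # N(s, a): number of simulations through this node
    value_sum: float = 0.0       # sum of values backed up through this node
    children: dict = field(default_factory=dict)  # action label -> Node

    def q(self) -> float:
        """Mean backed-up value Q(s, a); 0.0 while the node is unvisited."""
        return self.value_sum / self.visits if self.visits else 0.0


def reward(semantic: float, cohesion: float, transition: float,
           weights=(0.5, 0.25, 0.25)) -> float:
    """Combine the three objectives named in the abstract.

    Assumption: a weighted sum with illustrative weights; the abstract
    does not state how the objectives are combined.
    """
    w_s, w_c, w_t = weights
    return w_s * semantic + w_c * cohesion + w_t * transition


def puct_select(parent: Node, c_puct: float = 1.5):
    """Pick the (action, child) pair maximising Q + U (AlphaZero-style PUCT)."""
    def score(child: Node) -> float:
        # Exploration bonus: large for high-prior, rarely visited children.
        u = c_puct * child.prior * math.sqrt(parent.visits) / (1 + child.visits)
        return child.q() + u
    return max(parent.children.items(), key=lambda kv: score(kv[1]))


def backup(path, value: float) -> None:
    """Propagate a simulation result to every node on the visited path."""
    for node in path:
        node.visits += 1
        node.value_sum += value


# Toy usage: two candidate techniques for the next phase, with made-up priors.
root = Node(prior=1.0, visits=1)
root.children = {"a": Node(prior=0.7), "b": Node(prior=0.3)}
action, child = puct_select(root)
backup([root, child], reward(semantic=1.0, cohesion=0.5, transition=0.0))
```

In this toy run the higher-prior candidate is selected first because neither child has been visited yet; as visits accumulate, the exploration bonus shrinks and the backed-up reward increasingly determines which branch the search prefers.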

Page Count
8 pages

Category
Computer Science:
Cryptography and Security