PROVSYN: Synthesizing Provenance Graphs for Data Augmentation in Intrusion Detection Systems

Published: June 6, 2025 | arXiv ID: 2506.06226v1

By: Yi Huang, Wajih Ul Hassan, Yao Guo and more

Potential Business Impact:

Makes computers better at finding sneaky hackers.

Business Areas:

Predictive Analytics Artificial Intelligence, Data and Analytics, Software

Provenance graph analysis plays a vital role in intrusion detection, particularly against Advanced Persistent Threats (APTs), by exposing complex attack patterns. While recent systems combine graph neural networks (GNNs) with natural language processing (NLP) to capture structural and semantic features, their effectiveness is limited by class imbalance in real-world data. To address this, we introduce PROVSYN, an automated framework that synthesizes provenance graphs through a three-phase pipeline: (1) heterogeneous graph structure synthesis with structural-semantic modeling, (2) rule-based topological refinement, and (3) context-aware textual attribute synthesis using large language models (LLMs). PROVSYN includes a comprehensive evaluation framework that integrates structural, textual, temporal, and embedding-based metrics, along with a semantic validation mechanism to assess the correctness of generated attack patterns and system behaviors. To demonstrate practical utility, we use the synthetic graphs to augment training datasets for downstream APT detection models. Experimental results show that PROVSYN produces high-fidelity graphs and improves detection performance through effective data augmentation.
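The augmentation step described above can be illustrated with a minimal sketch. It does not reproduce PROVSYN's synthesis pipeline; it only shows how already-generated synthetic attack graphs might be mixed into an imbalanced training set. The names `LabeledGraph`, `augment_training_set`, and `target_attack_ratio` are hypothetical and do not come from the paper, and the balancing rule (add synthetic attack samples until a chosen class ratio is reached) is an assumption.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class LabeledGraph:
    """Hypothetical stand-in for a labeled provenance graph."""
    graph_id: str
    is_attack: bool
    synthetic: bool = False


def augment_training_set(real, synthetic_attacks, target_attack_ratio=0.5, seed=0):
    """Return real graphs plus enough synthetic attack graphs to reach the
    target attack ratio, or as many as the synthetic pool allows."""
    if not 0 < target_attack_ratio < 1:
        raise ValueError("target_attack_ratio must be strictly between 0 and 1")

    benign = sum(1 for g in real if not g.is_attack)
    attacks = len(real) - benign

    # Smallest attack count a with a / (benign + a) >= target_attack_ratio.
    required = math.ceil(target_attack_ratio * benign / (1 - target_attack_ratio))

    # Never add a negative number, and never more than the pool contains.
    to_add = min(max(0, required - attacks), len(synthetic_attacks))

    rng = random.Random(seed)  # seeded so the selection is reproducible
    chosen = rng.sample(list(synthetic_attacks), to_add)
    return list(real) + chosen
```

With 8 benign and 2 attack graphs and a target ratio of 0.5, this adds 6 synthetic attack graphs if the pool holds at least that many. A real setup would also need to keep synthetic samples out of the validation and test splits so that detection metrics reflect real data only.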

ProvX: Generating Counterfactual-Driven Attack Explanations for Provenance-Based Detection

Cryptography and Security

Explains how computer attacks happen to stop them.

8 Aug 2025 2

89%

Distributed Temporal Graph Learning with Provenance for APT Detection in Supply Chains

Cryptography and Security

Finds sneaky computer attacks hidden in software.

3 Apr 2025 0

88%

Knowledge Transfer from LLMs to Provenance Analysis: A Semantic-Augmented Method for APT Detection

Cryptography and Security

Finds hidden computer attacks using smart AI.

24 Mar 2025 1

Country of Origin

🇺🇸 🇨🇳 China, United States

Page Count

18 pages
