
Learning to Triage Taint Flows Reported by Dynamic Program Analysis in Node.js Packages

Published: October 23, 2025 | arXiv ID: 2510.20739v1

By: Ronghao Ni, Aidan Z. H. Yang, Min-Chien Hsu and more

BigTech Affiliations: Amazon

Potential Business Impact:

Helps find computer bugs faster and easier.

Business Areas:

Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Program analysis tools often produce large volumes of candidate vulnerability reports that require costly manual review, creating a practical challenge: how can security analysts prioritize the reports most likely to be true vulnerabilities? This paper investigates whether machine learning can be applied to prioritizing vulnerabilities reported by program analysis tools. We focus on Node.js packages and collect a benchmark of 1,883 Node.js packages, each containing one reported arbitrary code execution (ACE) or arbitrary command injection (ACI) vulnerability. We evaluate a variety of machine learning approaches, including classical models, graph neural networks (GNNs), large language models (LLMs), and hybrid models that combine GNNs and LLMs, trained on data based on a dynamic program analysis tool's output. The top LLM achieves $F_{1} = 0.915$, while the best GNN and classical ML models reach $F_{1} = 0.904$. At a false-negative rate below 7%, the leading model eliminates 66.9% of benign packages from manual review, taking around 60 ms per package. If the best model is tuned to operate at a precision of 0.8 (i.e., allowing 20% false positives among all warnings), our approach can detect 99.2% of exploitable taint flows while missing only 0.8%, demonstrating strong potential for real-world vulnerability triage.
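The trade-off described in the abstract (fix a precision level, then see how many true vulnerabilities are still caught) can be illustrated with a minimal sketch. This is not the paper's code: the function names, the labels, and the scores below are hypothetical toy values chosen only to show how precision, recall, and F1 relate to a decision threshold. Here a label of 1 means an exploitable taint flow and 0 means a benign report.

```python
from typing import List, Optional, Tuple


def precision_recall_f1(
    labels: List[int], scores: List[float], threshold: float
) -> Tuple[float, float, float]:
    """Flag every report with score >= threshold and score the result."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def lowest_threshold_for_precision(
    labels: List[int], scores: List[float], target: float
) -> Optional[float]:
    """Return the smallest observed score that, used as a threshold,
    reaches the target precision. Recall never increases as the threshold
    rises, so the smallest qualifying threshold keeps recall as high as
    possible. Returns None if no threshold qualifies."""
    for t in sorted(set(scores)):
        precision, _, _ = precision_recall_f1(labels, scores, t)
        if precision >= target:
            return t
    return None


# Hypothetical toy data, NOT taken from the paper's benchmark.
toy_labels = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]
toy_scores = [0.95, 0.9, 0.85, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

threshold = lowest_threshold_for_precision(toy_labels, toy_scores, target=0.8)
if threshold is not None:
    p, r, f1 = precision_recall_f1(toy_labels, toy_scores, threshold)
    print(f"threshold={threshold} precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

In these terms, the false-negative rate is one minus recall, and the share of benign packages removed from manual review is the fraction of label-0 reports that fall below the chosen threshold.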

Prompting the Priorities: A First Look at Evaluating LLMs for Vulnerability Triage and Prioritization

Cryptography and Security

Helps computers find computer security risks faster.

21 Oct 2025 0

87%

Towards Classifying Benign And Malicious Packages Using Machine Learning

Cryptography and Security

Finds bad computer code before it causes harm.

19 Nov 2025 0

87%

Evaluating Line-level Localization Ability of Learning-based Code Vulnerability Detection Models

Machine Learning (CS)

Finds exact code errors, not just whole programs.

13 Oct 2025 1


Country of Origin

🇺🇸 United States

Page Count

17 pages
