Vulnerability-Affected Versions Identification: How Far Are We?

Published: September 4, 2025 | arXiv ID: 2509.03876v2

By: Xingchu Chen, Chengwei Liu, Jialun Cao and more

Potential Business Impact:

Finds which versions of a computer program are affected by a security problem.

Business Areas:
Penetration Testing, Information Technology, Privacy and Security

Identifying which software versions are affected by a vulnerability is critical for patching and risk mitigation. Despite a growing body of tools, their real-world effectiveness remains unclear due to narrow evaluation scopes often limited to early SZZ variants, outdated techniques, and small or coarse-grained datasets. In this paper, we present the first comprehensive empirical study of vulnerability-affected versions identification. We curate a high-quality benchmark of 1,128 real-world C/C++ vulnerabilities and systematically evaluate 12 representative tools from both tracing and matching paradigms across four dimensions: effectiveness at both vulnerability and version levels, root causes of false positives and negatives, sensitivity to patch characteristics, and ensemble potential. Our findings reveal fundamental limitations: no tool exceeds 45.0% accuracy, with key challenges stemming from heuristic dependence, limited semantic reasoning, and rigid matching logic. Patch structures such as add-only and cross-file changes further hinder performance. Although ensemble strategies can improve results by up to 10.1%, overall accuracy remains below 60.0%, highlighting the need for fundamentally new approaches. Moreover, our study offers actionable insights to guide tool development, combination strategies, and future research in this critical area. Finally, we release the replicated code and benchmark on our website to encourage future contributions.

Country of Origin
🇸🇬 🇭🇰 Singapore, Hong Kong

Page Count
13 pages

Category
Computer Science:
Software Engineering