LLM-based Vulnerability Discovery through the Lens of Code Metrics
By: Felix Weissberg , Lukas Pirch , Erik Imgrund and more
Potential Business Impact:
Finds computer bugs by looking at code patterns.
Large language models (LLMs) excel in many tasks of software engineering, yet progress in leveraging them for vulnerability discovery has stalled in recent years. To understand this phenomenon, we investigate LLMs through the lens of classic code metrics. Surprisingly, we find that a classifier trained solely on these metrics performs on par with state-of-the-art LLMs for vulnerability discovery. A root-cause analysis reveals a strong correlation and a causal effect between LLMs and code metrics: When the value of a metric is changed, LLM predictions tend to shift by a corresponding magnitude. This dependency suggests that LLMs operate at a similarly shallow level as code metrics, limiting their ability to grasp complex patterns and fully realize their potential in vulnerability discovery. Based on these findings, we derive recommendations on how research should more effectively address this challenge.
Similar Papers
Everything You Wanted to Know About LLM-based Vulnerability Detection But Were Afraid to Ask
Cryptography and Security
Finds computer bugs better with more code info.
Can LLMs Classify CVEs? Investigating LLMs Capabilities in Computing CVSS Vectors
Cryptography and Security
Helps computers score software flaws more fairly.
Large Language Model (LLM) for Software Security: Code Analysis, Malware Analysis, Reverse Engineering
Cryptography and Security
Helps computers find computer viruses faster.