SoK: Understanding (New) Security Issues Across AI4Code Use Cases
By: Qilong Wu, Taoran Li, Tianyang Zhou, et al.
AI-for-Code (AI4Code) systems are reshaping software engineering, with tools like GitHub Copilot accelerating code generation, translation, and vulnerability detection. Alongside these advances, however, security risks remain pervasive: insecure outputs, biased benchmarks, and susceptibility to adversarial manipulation undermine their reliability. This SoK surveys the landscape of AI4Code security across three core applications (code generation, vulnerability detection, and code translation), identifying recurring gaps: benchmarks dominated by Python and toy problems, a lack of standardized security datasets, data leakage in evaluation, and fragile adversarial robustness. A comparative study of six state-of-the-art models illustrates these challenges: insecure patterns persist in code generation, vulnerability detection is brittle to semantic-preserving attacks, fine-tuning often misaligns with security objectives, and code translation yields uneven security benefits. From this analysis, we distill three forward paths: embedding secure-by-default practices in code generation, building robust and comprehensive detection benchmarks, and leveraging translation as a route to security-enhanced languages. We call for a shift toward security-first AI4Code, where vulnerability mitigation and robustness are embedded throughout the development life cycle.
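To make two of these findings concrete, the Python sketch below is our own minimal illustration, not code from the paper: it pairs an insecure SQL-construction pattern of the kind commonly flagged in generated code (CWE-89) with a semantic-preserving identifier rename, the simplest member of the attack family that brittle detectors fail on. The function names and the RenameIdentifiers helper are hypothetical.

```python
import ast
import sqlite3

# --- Failure mode 1: insecure generated patterns (CWE-89, SQL injection) ---

def lookup_user_insecure(conn, username):
    # Pattern widely reported in AI-generated code: user input
    # interpolated directly into a SQL string.
    query = f"SELECT id FROM users WHERE name = '{username}'"
    return conn.execute(query).fetchall()

def lookup_user_secure(conn, username):
    # Secure-by-default equivalent: a parameterized query.
    return conn.execute(
        "SELECT id FROM users WHERE name = ?", (username,)
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

payload = "alice' OR '1'='1"                # classic injection payload
print(lookup_user_insecure(conn, payload))  # [(1,)] -- leaks every row
print(lookup_user_secure(conn, payload))    # []     -- no such user

# --- Failure mode 2: semantic-preserving attacks on detectors ---

class RenameIdentifiers(ast.NodeTransformer):
    """Renames variables and parameters consistently: the program's
    behavior is unchanged, but its surface tokens are not."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):
        node.id = self.mapping.get(node.id, node.id)
        return node

    def visit_arg(self, node):
        # Also rename function parameters so semantics are preserved.
        node.arg = self.mapping.get(node.arg, node.arg)
        return node

vulnerable_src = (
    "def lookup(conn, username):\n"
    "    query = \"SELECT id FROM users WHERE name = '\" + username + \"'\"\n"
    "    return conn.execute(query)\n"
)
tree = RenameIdentifiers({"username": "v0", "query": "v1"}).visit(
    ast.parse(vulnerable_src)
)
print(ast.unparse(tree))  # same injection flaw, different identifiers (Py 3.9+)
```

A detector that keys on surface tokens such as `query` or `username` can miss the renamed variant even though the injection flaw is untouched, illustrating the brittleness to semantic-preserving attacks that the abstract describes.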