Score: 0

GitHub's Copilot Code Review: Can AI Spot Security Flaws Before You Commit?

Published: September 17, 2025 | arXiv ID: 2509.13650v1

By: Amena Amro, Manar H. Alalfi

Potential Business Impact:

AI code checker misses big security problems.

Business Areas:
Artificial Intelligence Artificial Intelligence, Data and Analytics, Science and Engineering, Software

As software development practices increasingly adopt AI-powered tools, ensuring that such tools can support secure coding has become critical. This study evaluates the effectiveness of GitHub Copilot's recently introduced code review feature in detecting security vulnerabilities. Using a curated set of labeled vulnerable code samples drawn from diverse open-source projects spanning multiple programming languages and application domains, we systematically assessed Copilot's ability to identify and provide feedback on common security flaws. Contrary to expectations, our results reveal that Copilot's code review frequently fails to detect critical vulnerabilities such as SQL injection, cross-site scripting (XSS), and insecure deserialization. Instead, its feedback primarily addresses low-severity issues, such as coding style and typographical errors. These findings expose a significant gap between the perceived capabilities of AI-assisted code review and its actual effectiveness in supporting secure development practices. Our results highlight the continued necessity of dedicated security tools and manual code audits to ensure robust software security.

Country of Origin
🇨🇦 Canada

Page Count
8 pages

Category
Computer Science:
Software Engineering