Do Prompt Patterns Affect Code Quality? A First Empirical Assessment of ChatGPT-Generated Code

Published: April 18, 2025 | arXiv ID: 2504.13656v1

By: Antonio Della Porta, Stefano Lambiase, Fabio Palomba

Potential Business Impact:

Shows whether prompt design choices make AI-generated code easier to maintain and trust.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Large Language Models (LLMs) have rapidly transformed software development, especially code generation. However, their inconsistent performance, prone to hallucinations and quality issues, complicates program comprehension and hinders maintainability. Research indicates that prompt engineering, the practice of designing inputs to direct LLMs toward generating relevant outputs, may help address these challenges. In this regard, researchers have introduced prompt patterns: structured templates intended to guide users in formulating their requests. However, the influence of prompt patterns on code quality has yet to be thoroughly investigated. An improved understanding of this relationship would be essential to advancing our collective knowledge of how to use LLMs effectively for code generation, thereby enhancing the understandability of generated code in contemporary software development. This paper empirically investigates the impact of prompt patterns on code quality, specifically maintainability, security, and reliability, using the Dev-GPT dataset. Results show that Zero-Shot prompting is the most common pattern, followed by Zero-Shot with Chain-of-Thought and Few-Shot. Analysis of 7,583 code files across quality metrics revealed minimal issues, with Kruskal-Wallis tests indicating no significant differences among patterns, suggesting that prompt structure may not substantially impact these quality metrics in ChatGPT-assisted code generation.
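The comparison across prompt patterns rests on the Kruskal-Wallis test, a non-parametric analogue of one-way ANOVA that checks whether several independent groups share the same distribution. The sketch below is a minimal illustration of how such a comparison might be run with SciPy; the variable names, group labels, and issue counts are invented for demonstration and are not the paper's actual data or pipeline.

```python
# Minimal sketch of a Kruskal-Wallis comparison of code-quality issue
# counts across prompt patterns. All data below is illustrative, not
# taken from the Dev-GPT dataset or the paper.
from scipy.stats import kruskal

# Hypothetical per-file issue counts (e.g., from a static-analysis
# scan), grouped by the prompt pattern used to generate each file.
issues_zero_shot     = [0, 1, 0, 2, 0, 0, 1]
issues_zero_shot_cot = [0, 0, 1, 0, 1, 0]
issues_few_shot      = [1, 0, 0, 0, 2, 0]

# H0: the issue-count distributions are the same for all patterns.
statistic, p_value = kruskal(
    issues_zero_shot, issues_zero_shot_cot, issues_few_shot
)

print(f"H = {statistic:.3f}, p = {p_value:.3f}")
# A p-value above the chosen alpha (commonly 0.05) means H0 cannot be
# rejected, which corresponds to the paper's finding of no significant
# quality differences among prompt patterns.
```

In practice, the groups would be the per-file metric values (maintainability, security, or reliability issues) partitioned by detected prompt pattern, with one test per metric.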

Country of Origin
🇮🇹 Italy

Repos / Data Links

Page Count
12 pages

Category
Computer Science:
Software Engineering