
Enhancing LLM Code Generation: A Systematic Evaluation of Multi-Agent Collaboration and Runtime Debugging for Improved Accuracy, Reliability, and Latency

Published: May 4, 2025 | arXiv ID: 2505.02133v1

By: Nazmus Ashrafi, Salah Bouktif, Mohammed Mediani

Potential Business Impact:

Makes computers write better computer programs.

Business Areas:

Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

The use of large language models (LLMs) for automated code generation has emerged as a significant focus within AI research. As these pretrained models continue to evolve, their ability to understand and generate complex code structures has opened new possibilities for automating intricate programming tasks with accurate generated code. Although contemporary foundation models demonstrate promising results, researchers continue to explore optimal post-training strategies to enhance code quality. These include supervised fine-tuning, retrieval-augmented generation (RAG), debugging, and many others. In this paper, we combine two widely used approaches, namely multi-agent collaboration and runtime execution information-based debugging, to improve code generation functionality, reliability, and practical applicability. We perform an empirical study that extends the evaluation of the individual strategies and also evaluates the proposed composition of both strategies' activities. Our study uses 19 LLMs to examine the performance of the individual and the proposed strategies, offering comprehensive insights into how different compositions of programming activities and training paradigms influence code generation effectiveness. In particular, we implement a chained system that combines both strategies to assess their combined impact on functional accuracy, code reliability, and generation latency using two benchmark datasets commonly used for code generation. Our findings provide valuable insights for organizations seeking robust AI-driven coding solutions by guiding them in selecting models that can better adapt to complex post-training strategies, ultimately fostering the adoption of more effective and reliable code generation technologies.
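The abstract does not spell out how the chained system is built, so the following is only a minimal sketch of the general idea: a collaboration stage (one role plans, another writes code) followed by a runtime debugging stage that executes the candidate and feeds any error trace back to the model. Every name here (`chained_generate`, `run_tests`, the role prompts, and `stub_llm`) is a hypothetical placeholder, not the authors' implementation, and a canned stub stands in for a real LLM.

```python
import traceback
from typing import Callable, Optional, Tuple

# Hypothetical type: any function that maps a prompt string to generated text.
LLM = Callable[[str], str]


def run_tests(code: str, test_code: str) -> Optional[str]:
    """Run candidate code plus its tests; return a traceback on failure, else None."""
    namespace: dict = {}
    try:
        exec(code, namespace)       # define the candidate functions
        exec(test_code, namespace)  # run assertions against them
        return None
    except Exception:
        return traceback.format_exc()


def chained_generate(
    task: str, test_code: str, llm: LLM, max_debug_rounds: int = 3
) -> Tuple[str, bool]:
    # Stage 1 (assumed shape of multi-agent collaboration):
    # an "analyst" role drafts a plan, a "coder" role implements it.
    plan = llm(f"ROLE: analyst\nTASK: {task}\nWrite a short plan.")
    code = llm(f"ROLE: coder\nTASK: {task}\nPLAN: {plan}\nWrite the code.")

    # Stage 2 (assumed shape of runtime debugging):
    # execute the code and hand the runtime error back to the model.
    for _ in range(max_debug_rounds):
        error = run_tests(code, test_code)
        if error is None:
            return code, True
        code = llm(
            f"ROLE: debugger\nTASK: {task}\nCODE:\n{code}\n"
            f"RUNTIME ERROR:\n{error}\nReturn fixed code."
        )
    return code, run_tests(code, test_code) is None


def stub_llm(prompt: str) -> str:
    """Canned responses standing in for a real model (illustration only)."""
    if prompt.startswith("ROLE: analyst"):
        return "Add the two numbers."
    if prompt.startswith("ROLE: coder"):
        return "def add(a, b):\n    return a - b\n"  # deliberately buggy first draft
    return "def add(a, b):\n    return a + b\n"      # debugger's repaired version


code, passed = chained_generate(
    "Implement add(a, b).", "assert add(2, 3) == 5", stub_llm
)
```

In this toy run the first draft fails its test, the failure trace is passed to the debugger role, and the repaired version passes on the next round. A real setup would run generated code in a sandbox rather than with `exec` in the host process, and would also record latency per stage if that is one of the metrics of interest.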

Enhancing LLM-based Quantum Code Generation with Multi-Agent Optimization and Quantum Error Correction

Quantum Physics

AI writes better quantum computer programs.

20 Apr 2025 0

92%

Guided Code Generation with LLMs: A Multi-Agent Framework for Complex Code Tasks

Artificial Intelligence

Helps computers write better, more complex computer programs.

11 Jan 2025 1

91%

Experiments with Large Language Models on Retrieval-Augmented Generation for Closed-Source Simulation Software

Computation and Language

Helps computers understand secret software programs.

6 Feb 2025 0


Repos / Data Links

github.com

Page Count

19 pages
