IaC Generation with LLMs: An Error Taxonomy and A Study on Configuration Knowledge Injection
By: Roman Nekrasov, Stefano Fossati, Indika Kumara, and more
Large Language Models (LLMs) currently exhibit low success rates in generating correct and intent-aligned Infrastructure as Code (IaC). This research investigated methods to improve LLM-based IaC generation, specifically for Terraform, by systematically injecting structured configuration knowledge. To facilitate this, the existing IaC-Eval benchmark was significantly enhanced with cloud emulation and automated error analysis, and a novel error taxonomy for LLM-assisted IaC code generation was developed. A series of knowledge injection techniques was implemented and evaluated, progressing from naive Retrieval-Augmented Generation (RAG) to more sophisticated Graph RAG approaches, including semantic enrichment of graph components and modeling of inter-resource dependencies. Experimental results demonstrated that while baseline LLM performance was poor (27.1% overall success), injecting structured configuration knowledge increased technical validation success to 75.3% and overall success to 62.6%. Despite these gains in technical correctness, intent alignment plateaued, revealing a "Correctness-Congruence Gap": LLMs can become proficient "coders" but remain limited "architects" in fulfilling nuanced user intent.
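The naive RAG baseline described above can be illustrated with a minimal sketch: retrieve the Terraform documentation snippets most similar to the user request and inject them into the generation prompt. The knowledge-base snippets, similarity measure, and prompt template below are hypothetical simplifications, not the paper's actual pipeline.

```python
# Minimal sketch of naive RAG for Terraform generation.
# The KB snippets and prompt template are illustrative assumptions.
from collections import Counter
import math

def cosine(a: str, b: str) -> float:
    """Bag-of-words cosine similarity (stand-in for an embedding model)."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    num = sum(ca[t] * cb[t] for t in set(ca) & set(cb))
    den = (math.sqrt(sum(v * v for v in ca.values()))
           * math.sqrt(sum(v * v for v in cb.values())))
    return num / den if den else 0.0

# Hypothetical knowledge base of Terraform provider documentation snippets.
KB = [
    "aws_s3_bucket: the 'bucket' argument sets the bucket name; "
    "versioning is configured via aws_s3_bucket_versioning.",
    "aws_instance: 'ami' and 'instance_type' are required arguments.",
    "aws_vpc: 'cidr_block' defines the VPC address range.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the top-k snippets most similar to the user request."""
    return sorted(KB, key=lambda s: cosine(query, s), reverse=True)[:k]

def build_prompt(query: str) -> str:
    """Inject retrieved configuration knowledge into the LLM prompt."""
    context = "\n".join(retrieve(query))
    return ("Using the Terraform documentation below, generate HCL "
            f"for the request.\n\nDocs:\n{context}\n\nRequest: {query}")

prompt = build_prompt("create an S3 bucket with versioning enabled")
```

Graph RAG variants would replace the flat snippet list with a graph of resources and their dependencies, so that retrieving `aws_s3_bucket` also pulls in linked resources such as `aws_s3_bucket_versioning`.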
Similar Papers
GenSIaC: Toward Security-Aware Infrastructure-as-Code Generation with Large Language Models
Cryptography and Security
Makes computer code safer from mistakes.
Deployability-Centric Infrastructure-as-Code Generation: An LLM-based Iterative Framework
Software Engineering
Makes computer setups work automatically and correctly.
An Expert-grounded benchmark of General Purpose LLMs in LCA
Computation and Language
AI can help with eco-friendly product checks.