Theoretical Foundations of GPU-Native Compilation for Rapid Code Iteration
By: Adilet Metinov, Gulida M. Kudakeeva, Gulnara D. Kabaeva
Current AI code generation systems suffer from significant latency bottlenecks due to CPU-GPU data transfers during compilation, execution, and testing phases. We establish theoretical foundations for three complementary approaches to GPU-native compilation that eliminate these transfers: (1) parallel traditional compilation adapted for GPU execution, (2) neural compilation using learned sequence-to-sequence translation with probabilistic verification, and (3) hybrid architectures combining both strategies. We derive latency and energy bounds demonstrating potential speedups of 10-100x for code iteration cycles. Our analysis shows that traditional GPU compilation provides 2-5x improvements through transfer elimination, neural compilation achieves 10-100x speedups via massive parallelism, and hybrid approaches offer practical deployment paths with guaranteed correctness. We formalize the probabilistic verification framework that enables trading compilation accuracy for parallel exploration, and discuss implications for self-improving AI systems and future analog computing substrates.
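To make the hybrid strategy concrete, the sketch below shows how neural candidate generation, probabilistic verification, and a traditional compiler fallback could fit together in one iteration loop. This is a minimal Python illustration of the idea only, not code from the paper; the functions neural_compile_candidates, passes_tests, and traditional_compile are hypothetical placeholders standing in for the components named in the abstract.

    import random

    # Hypothetical sketch of the hybrid compilation loop described in the abstract:
    # a learned sequence-to-sequence compiler proposes many candidate translations
    # in parallel, each candidate is checked by probabilistic verification against
    # test cases, and a traditional (GPU-adapted) compiler serves as the
    # correctness-guaranteeing fallback. All names below are placeholders.

    def neural_compile_candidates(source: str, n: int) -> list[str]:
        # Stand-in for the learned translator emitting n candidates in parallel.
        return [f"candidate_{i}_for_{hash(source) & 0xffff}" for i in range(n)]

    def passes_tests(candidate: str, tests: list[tuple]) -> bool:
        # Stand-in for probabilistic verification: in a real system the candidate
        # would be executed against the test cases; here acceptance is simulated.
        return random.random() > 0.3

    def traditional_compile(source: str) -> str:
        # Stand-in for the slower but exact GPU-adapted traditional compiler.
        return f"verified_binary_for_{hash(source) & 0xffff}"

    def hybrid_compile(source: str, tests: list[tuple], n_candidates: int = 32) -> str:
        # Accept the first candidate that survives verification; otherwise fall
        # back to traditional compilation for guaranteed correctness.
        for candidate in neural_compile_candidates(source, n_candidates):
            if passes_tests(candidate, tests):
                return candidate
        return traditional_compile(source)

    if __name__ == "__main__":
        print(hybrid_compile("def f(x): return x + 1", tests=[((0,), 1), ((2,), 3)]))

Accepting the first verified candidate is the mechanism by which compilation accuracy is traded for parallel exploration, while the traditional fallback preserves the guaranteed-correctness property the abstract attributes to the hybrid approach.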