GRACE: Graph-Guided Repository-Aware Code Completion through Hierarchical Code Fusion
By: Xingliang Wang, Baoyi Wang, Chen Zhi and more
Potential Business Impact:
Helps computers understand entire code projects.
LLMs excel at localized code completion but struggle with repository-level tasks due to limited context windows and complex semantic and structural dependencies across codebases. While Retrieval-Augmented Generation (RAG) mitigates context scarcity by retrieving relevant code snippets, current approaches face significant limitations. They rely too heavily on textual similarity for retrieval, neglecting structural relationships such as call chains and inheritance hierarchies, and they lose critical structural information by naively concatenating retrieved snippets into text sequences for LLM input. To address these shortcomings, GRACE constructs a multi-level, multi-semantic code graph that unifies file structures, abstract syntax trees, function call graphs, class hierarchies, and data flow graphs to capture both static and dynamic code semantics. For retrieval, GRACE employs a Hybrid Graph Retriever that integrates graph neural network-based structural similarity with textual retrieval, refined by a graph attention network-based re-ranker to prioritize topologically relevant subgraphs. To enhance context, GRACE introduces a structural fusion mechanism that merges retrieved subgraphs with the local code context and preserves essential dependencies like function calls and inheritance. Extensive experiments on public repository-level benchmarks demonstrate that GRACE significantly outperforms state-of-the-art methods across all metrics. Using DeepSeek-V3 as the backbone LLM, GRACE surpasses the strongest graph-based RAG baselines by 8.19% in exact match (EM) and 7.51% in edit similarity (ES) points on every dataset. The code is available at https://anonymous.4open.science/r/grace_icse-C3D5.
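To make the idea of combining structural and textual signals concrete, here is a minimal Python sketch. It is not GRACE's implementation: the abstract describes a graph neural network for structural similarity and a graph attention re-ranker, whereas this toy swaps in shortest-path distance on a small typed graph and Jaccard token overlap. The function names (build_graph, distances, hybrid_rank), the relation labels, and the weight alpha are all assumptions made for illustration.

```python
import ast
import re
from collections import defaultdict, deque


def build_graph(sources):
    """Build a small typed graph from {filename: source_code}.

    Edges are stored in both directions so that traversal can move
    across containment, call, and inheritance links alike.
    """
    edges = defaultdict(set)  # node -> {(relation, neighbor)}
    text = {}                 # definition name -> its source text

    def link(a, rel, b):
        edges[a].add((rel, b))
        edges[b].add((rel, a))

    for fname, src in sources.items():
        for node in ast.walk(ast.parse(src)):
            if isinstance(node, (ast.FunctionDef, ast.ClassDef)):
                text[node.name] = ast.get_source_segment(src, node) or ""
                link(fname, "contains", node.name)
            if isinstance(node, ast.ClassDef):
                for base in node.bases:
                    if isinstance(base, ast.Name):
                        link(node.name, "inherits", base.id)
            if isinstance(node, ast.FunctionDef):
                for sub in ast.walk(node):
                    if isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name):
                        link(node.name, "calls", sub.func.id)
    return edges, text


def distances(edges, start):
    """Breadth-first hop counts from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        for _, nxt in edges[cur]:
            if nxt not in dist:
                dist[nxt] = dist[cur] + 1
                queue.append(nxt)
    return dist


def tokens(s):
    """Lowercased identifier-like tokens of a string."""
    return set(re.findall(r"[A-Za-z_]\w*", s.lower()))


def hybrid_rank(edges, text, query, anchor, alpha=0.5):
    """Rank definitions by alpha * textual + (1 - alpha) * structural score.

    textual: Jaccard overlap between query tokens and definition tokens.
    structural: 1 / (1 + hops from anchor), or 0 if unreachable.
    Both are simple stand-ins, not the learned scores the paper describes.
    """
    dist = distances(edges, anchor)
    q = tokens(query)
    scored = []
    for name, body in text.items():
        if name == anchor:
            continue
        t = tokens(body)
        textual = len(q & t) / len(q | t) if (q | t) else 0.0
        structural = 1.0 / (1 + dist[name]) if name in dist else 0.0
        scored.append((alpha * textual + (1 - alpha) * structural, name))
    return [name for _, name in sorted(scored, reverse=True)]


if __name__ == "__main__":
    sources = {
        "util.py": (
            "def scale(x, k):\n    return x * k\n\n"
            "def double(x):\n    return scale(x, 2)\n"
        ),
    }
    edges, text = build_graph(sources)
    print(hybrid_rank(edges, text, "return x * 2", anchor="double"))
```

In this toy setting, a definition that the anchor function calls directly gets a high structural score even when its text shares few tokens with the query, which is the kind of signal a purely textual retriever would miss.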
Similar Papers
Completion by Comprehension: Guiding Code Generation with Multi-Granularity Understanding
Software Engineering
Helps computers write better code by understanding its structure.
Retrieval-Augmented Code Generation: A Survey with Focus on Repository-Level Approaches
Software Engineering
Helps computers write complex software code.
CodeRAG: Supportive Code Retrieval on Bigraph for Real-World Code Generation
Software Engineering
Helps computers write complex code by finding examples.