Score: 2

CodeRAG: Finding Relevant and Necessary Knowledge for Retrieval-Augmented Repository-Level Code Completion

Published: September 19, 2025 | arXiv ID: 2509.16112v1

By: Sheng Zhang, Yifan Ding, Shuquan Lian and more

Potential Business Impact:

Helps computers write code faster and better.

Business Areas:

Semantic Search, Internet Services

Repository-level code completion automatically predicts unfinished code based on broader information from the repository. Recent strides in Code Large Language Models (code LLMs) have spurred the development of repository-level code completion methods, yielding promising results. Nevertheless, these methods suffer from issues such as inappropriate query construction, single-path code retrieval, and misalignment between the code retriever and the code LLM. To address these problems, we introduce CodeRAG, a framework tailored to identify relevant and necessary knowledge for retrieval-augmented repository-level code completion. Its core components include log probability guided query construction, multi-path code retrieval, and preference-aligned BestFit reranking. Extensive experiments on the ReccEval and CCEval benchmarks demonstrate that CodeRAG significantly and consistently outperforms state-of-the-art methods. The implementation of CodeRAG is available at https://github.com/KDEGroup/CodeRAG.
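To make the idea of multi-path retrieval concrete, here is a minimal sketch of the general pattern: several independent retrievers each propose candidate code chunks for the same query, and the results are merged and deduplicated. This is not the CodeRAG implementation. The two paths (`sparse_path`, `definition_path`), the function `multi_path_retrieve`, and the toy scoring are all hypothetical names and choices made for illustration; the abstract does not specify which retrieval paths the framework uses, and query construction and reranking are not modeled here.

```python
import re
from typing import Callable, List

Chunk = str
# A retrieval path takes (query, chunks, k) and returns up to k chunks.
Retriever = Callable[[str, List[Chunk], int], List[Chunk]]


def tokens(text: str) -> set:
    """Lowercased identifier-like tokens found in a piece of code."""
    return set(re.findall(r"[A-Za-z_]\w*", text.lower()))


def sparse_path(query: str, chunks: List[Chunk], k: int) -> List[Chunk]:
    """Hypothetical lexical path: rank chunks by Jaccard token overlap."""
    q = tokens(query)

    def jaccard(chunk: Chunk) -> float:
        c = tokens(chunk)
        union = q | c
        return len(q & c) / len(union) if union else 0.0

    return sorted(chunks, key=jaccard, reverse=True)[:k]


def definition_path(query: str, chunks: List[Chunk], k: int) -> List[Chunk]:
    """Hypothetical structural path: chunks that define a name the query uses."""
    q = tokens(query)
    hits = []
    for chunk in chunks:
        defined = set(re.findall(r"(?:def|class)\s+(\w+)", chunk.lower()))
        if defined & q:
            hits.append(chunk)
    return hits[:k]


def multi_path_retrieve(
    query: str, chunks: List[Chunk], paths: List[Retriever], k: int = 2
) -> List[Chunk]:
    """Merge candidates from every path, deduplicated in first-seen order."""
    seen = {}
    for path in paths:
        for chunk in path(query, chunks, k):
            seen.setdefault(chunk, None)
    return list(seen)
```

In this toy setup, a query such as `settings = load_config(` can pick up a usage example through the lexical path and the function definition through the structural path. A single path alone would return only one of the two, which is the motivation for combining paths before any reranking step.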

CodeRAG: Supportive Code Retrieval on Bigraph for Real-World Code Generation

Software Engineering

Helps computers write complex code by finding examples.

14 Apr 2025 0

92%

Retrieval-Augmented Code Generation: A Survey with Focus on Repository-Level Approaches

Software Engineering

Helps computers write complex software code.

6 Oct 2025 1

90%

What to Retrieve for Effective Retrieval-Augmented Code Generation? An Empirical Study and Beyond

Software Engineering

Helps computers write better code by finding good examples.

26 Mar 2025 3


Country of Origin

🇨🇳 China

Repos / Data Links

github.com github.com github.com

Page Count

13 pages
