LlavaCode: Compressed Code Representations for Retrieval-Augmented Code Generation
By: Daria Cherniuk, Nikita Sukhorukov, Nikita Sushko, and more
Potential Business Impact:
Makes code writing faster and smarter.
Retrieval-augmented generation has emerged as one of the most effective approaches for code completion, particularly when context from the surrounding repository is essential. However, incorporating context significantly extends sequence length, leading to slower inference, a critical limitation for interactive settings such as IDEs. In this work, we introduce LlavaCode, a framework that compresses code into compact, semantically rich representations interpretable by code LLMs, enhancing generation quality while reducing the retrieved context to only a few compressed single-token vectors. Using a small projector module, we can significantly increase the Exact Match (EM) and Edit Similarity (ES) metrics of the coding model with a negligible increase in latency. Our experiments demonstrate that compressed context enables a 20-38% reduction in Time-to-First-Token (TTFT) on line completion tasks compared to full-RAG pipelines.
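To make the idea concrete, the core mechanism can be sketched as follows: each retrieved snippet is encoded to a single vector, and a small trained projector maps that vector into the code LLM's embedding space, yielding one "soft token" per snippet that is prepended to the prompt embeddings. This is a minimal numpy sketch, not the paper's implementation; the dimensions, the two-layer MLP projector shape, and the random weights are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: the retrieval encoder's output dimension and the
# code LLM's hidden dimension (illustrative, not taken from the paper).
ENC_DIM, LLM_DIM = 384, 1024

def project_context(snippet_embeddings, w1, b1, w2, b2):
    """Map each retrieved-snippet embedding to one 'soft token' in the
    LLM's embedding space via a small two-layer MLP projector."""
    h = np.maximum(snippet_embeddings @ w1 + b1, 0.0)  # ReLU hidden layer
    return h @ w2 + b2                                 # one vector per snippet

# Randomly initialized projector weights (in practice these are trained
# so the LLM can interpret the compressed vectors).
w1 = rng.standard_normal((ENC_DIM, 512)) * 0.02
b1 = np.zeros(512)
w2 = rng.standard_normal((512, LLM_DIM)) * 0.02
b2 = np.zeros(LLM_DIM)

# Three retrieved snippets, each already encoded to a single ENC_DIM vector.
snippets = rng.standard_normal((3, ENC_DIM))
soft_tokens = project_context(snippets, w1, b1, w2, b2)

# Prepend the 3 compressed tokens to a 10-token prompt: the LLM attends over
# 13 embeddings instead of 10 + hundreds of raw retrieved-context tokens.
prompt_embeddings = rng.standard_normal((10, LLM_DIM))
llm_input = np.concatenate([soft_tokens, prompt_embeddings], axis=0)
print(llm_input.shape)  # (13, 1024)
```

The latency benefit follows directly from the shape arithmetic: prefill cost grows with sequence length, so replacing hundreds of retrieved tokens with a handful of projected vectors shrinks the prompt the model must process before emitting its first token.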
Similar Papers
ReCode: Improving LLM-based Code Repair with Fine-Grained Retrieval-Augmented Generation
Software Engineering
Fixes computer code faster and cheaper.
LongCodeZip: Compress Long Context for Code Language Models
Computation and Language
Makes computer programs understand more code faster.
Completion by Comprehension: Guiding Code Generation with Multi-Granularity Understanding
Software Engineering
Helps computers write better code by understanding its structure.