CODEPROMPTZIP: Code-specific Prompt Compression for Retrieval-Augmented Generation in Coding Tasks with LMs

Pengfei He; Shaowei Wang; Tse-Hsun Chen

arXiv:2502.14925·cs.SE·April 13, 2026

TL;DR

This paper introduces CodePromptZip, a code-specific prompt compression framework for retrieval-augmented generation in coding tasks, improving prompt efficiency and performance.

Contribution

The paper presents a type-aware, priority-driven code compression method tailored for coding tasks, in which a small language model with a copy mechanism is trained as the compressor.

Findings

01  CodePromptZip outperforms state-of-the-art baselines in multiple coding tasks.

02  The framework achieves up to 28.7% improvement over baselines.

03  Compression minimizes performance degradation while reducing prompt length.

Abstract

Retrieval-Augmented Generation (RAG) enhances coding tasks by incorporating retrieved code examples into prompts. However, lengthy prompts, often exceeding tens of thousands of tokens, introduce challenges related to the limited context windows of language models (LMs) and high computational costs. Existing prompt compression techniques focus on natural language and lack tailored solutions for code. To address this gap, we propose CodePromptZip, a framework that compresses code examples before integrating them into RAG workflows. Our framework employs a type-aware, priority-driven strategy to construct training samples for the code compression model. Using program analysis, we identify token types (e.g., Identifier) and perform ablation analysis to rank their removal priorities based on their impact on task performance. We then train a small LM as the compressor on these samples, enabling…
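
The type-aware, priority-driven removal described in the abstract can be illustrated with a minimal sketch. It assumes Python source and the standard-library tokenizer as a stand-in for the paper's program analysis. The REMOVAL_PRIORITY ranking, the classify and compress helpers, and the ratio parameter are hypothetical names chosen for illustration; the ranking is not the one the authors derive from their ablation analysis. The sketch covers only the rule-based removal step, not the trained compressor.

```python
import io
import keyword
import tokenize

# Hypothetical removal order: earlier entries are dropped first.
# Illustrative assumption only, not the paper's measured priorities.
REMOVAL_PRIORITY = ["comment", "string", "number", "identifier", "keyword", "operator"]


def classify(tok):
    """Map a tokenizer token to a coarse type, or None for layout tokens."""
    if tok.type == tokenize.COMMENT:
        return "comment"
    if tok.type == tokenize.STRING:
        return "string"
    if tok.type == tokenize.NUMBER:
        return "number"
    if tok.type == tokenize.NAME:
        return "keyword" if keyword.iskeyword(tok.string) else "identifier"
    if tok.type == tokenize.OP:
        return "operator"
    return None  # NEWLINE, INDENT, DEDENT, ENDMARKER, ...


def compress(source, ratio):
    """Drop roughly `ratio` of the content tokens, lowest-priority types first."""
    toks = list(tokenize.generate_tokens(io.StringIO(source).readline))
    content = [(i, classify(t)) for i, t in enumerate(toks)]
    content = [(i, kind) for i, kind in content if kind is not None]
    budget = int(len(content) * ratio)  # number of tokens to remove

    drop = set()
    for kind in REMOVAL_PRIORITY:
        for i, k in content:
            if len(drop) >= budget:
                break
            if k == kind:
                drop.add(i)

    kept = [toks[i].string for i, _ in content if i not in drop]
    return " ".join(kept)


example = "x = 1  # set x\nprint(x)\n"
print(compress(example, 0.25))  # prints: x = print ( x )
```

The output is a flattened token stream intended as prompt text for an LM, so it is not expected to remain valid or executable code. In the framework described above, samples built this way serve as training data for the compressor rather than being used directly at inference time.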
