How Does Chunking Affect Retrieval-Augmented Code Completion? A Controlled Empirical Study
Xinjian Wu, Jingzhi Gong, Gunel Jahangirova, Jie Zhang

TL;DR
This study empirically examines how different chunking strategies impact retrieval-augmented code completion, revealing that function-based chunking underperforms and cross-file context length is a key factor.
Contribution
It provides the first controlled empirical analysis of chunking strategies in RAG-based code completion, highlighting the importance of context length over chunking method.
Findings
Function chunking underperforms all other strategies by 3.57–5.64 percentage points on RepoEval.
Quadrupling context length from 2,048 to 8,192 tokens improves performance by up to 4.2 percentage points.
Sliding Window and cAST strategies dominate the cost–quality Pareto front.
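To make the difference between two of the compared strategies concrete, here is a minimal sketch of line-based sliding-window chunking next to function-level chunking for Python source. It is an illustration only, not the paper's implementation: the function names, the `window` and `stride` parameters, and the choice to chunk by lines rather than tokens are all assumptions.

```python
import ast


def sliding_window_chunks(source: str, window: int = 10, stride: int = 5) -> list[str]:
    """Split source into overlapping chunks of `window` lines, advancing by `stride`.

    Hypothetical helper: the window is measured in lines here for simplicity.
    """
    lines = source.splitlines()
    chunks = []
    for start in range(0, len(lines), stride):
        chunks.append("\n".join(lines[start:start + window]))
        # Stop once the current window already reaches the end of the file.
        if start + window >= len(lines):
            break
    return chunks


def function_chunks(source: str) -> list[str]:
    """Return one chunk per top-level function definition.

    Hypothetical helper: nested functions and methods are not handled.
    """
    tree = ast.parse(source)
    lines = source.splitlines()
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # lineno is 1-based and end_lineno is inclusive.
            chunks.append("\n".join(lines[node.lineno - 1:node.end_lineno]))
    return chunks


example = "import os\n\ndef a():\n    return 1\n\ndef b():\n    return 2\n"
window_result = sliding_window_chunks(example, window=4, stride=2)
function_result = function_chunks(example)
```

In this sketch the sliding window covers every line of the file, including the `import os` line, while the function chunker emits only the two function bodies. That is a property of this simplified code and should not be read as the paper's explanation for the measured gap.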
Abstract
Retrieval-augmented generation (RAG) pipelines for code completion rely on chunking to segment source files into retrievable units, yet chunking strategies are typically adopted without empirical justification, and practitioner recommendations are notably inconsistent. We present a controlled empirical study isolating the effect of chunking on code completion quality by crossing four representative strategies (Function, Declaration, Sliding Window, and cAST) with four retrievers, five generators, and nine parameter configurations on two benchmarks (RepoEval and CrossCodeEval), totaling 864 experimental settings. Our results reveal that chunking strategy has a statistically significant effect on RAG-based code completion. Contrary to intuition, chunking based on functions underperforms all other strategies by 3.57–5.64 percentage points on RepoEval (Cliff's delta = -1.0), while the…
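The abstract reports the effect size as Cliff's delta. As a reminder of what a value of -1.0 means, the sketch below computes the statistic from its standard definition: the share of pairs where a value from the first sample exceeds one from the second, minus the share where it is smaller. The function name and the score lists are hypothetical and are not data from the paper.

```python
def cliffs_delta(xs: list[float], ys: list[float]) -> float:
    """Cliff's delta: (#pairs with x > y minus #pairs with x < y) / (len(xs) * len(ys))."""
    if not xs or not ys:
        raise ValueError("both samples must be non-empty")
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


# Made-up scores for illustration only: every value in the first
# sample is below every value in the second sample.
function_scores = [40.1, 41.3, 39.8]
other_scores = [44.0, 45.2, 46.7]
delta = cliffs_delta(function_scores, other_scores)
```

A delta of -1.0 indicates complete separation: every observation in the first sample is lower than every observation in the second. A delta of 0 indicates that neither sample tends to exceed the other.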
