Retrieval-augmented code completion for local projects using large language models
Marko Hostnik, Marko Robnik-Šikonja

TL;DR
This paper explores using small, efficient language models combined with retrieval techniques to improve local code completion, addressing privacy and computational issues of larger models.
Contribution
It trains two small (160M-parameter) models, GPT-2 and the retrieval-adapted RETRO, on open-source Python files for local code completion, and shows that retrieval, both embedding-based and in-context, improves completion quality.
Findings
In-context RAG improves code completion by over 26%.
RETRO enhances GPT-2 performance by 12%.
Proper tokenization is crucial for optimal results.
Abstract
The use of large language models (LLMs) is becoming increasingly widespread among software developers. However, privacy and computational requirements are problematic with commercial solutions and the use of LLMs. In this work, we focus on using relatively small and efficient LLMs with 160M parameters that are suitable for local execution and augmentation with retrieval from local projects. We train two open transformer-based models, the generative GPT-2 and the retrieval-adapted RETRO, on open-source Python files, and empirically compare them, confirming the benefits of embedding-based retrieval. Furthermore, we improve our models' performance with In-context retrieval-augmented generation (RAG), which retrieves code snippets using the Jaccard similarity of tokens. We evaluate In-context RAG on larger models and determine that, despite its simplicity, the approach is more suitable than…
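The retrieval step described in the abstract, ranking code snippets by the Jaccard similarity of their tokens and placing the best matches in the model's context, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the regex tokenizer stands in for the model's own tokenizer, and the names `tokenize`, `jaccard`, `retrieve`, and `build_prompt`, as well as the prompt layout, are hypothetical.

```python
import re


def tokenize(code):
    # Assumption: a crude regex tokenizer (identifiers, numbers, single
    # punctuation characters) stands in for the model's real tokenizer.
    return set(re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", code))


def jaccard(a, b):
    # Jaccard similarity of two token sets: |A ∩ B| / |A ∪ B|.
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def retrieve(query, snippets, k=2):
    # Rank candidate snippets from the local project by similarity to the query
    # and keep the k most similar ones.
    q = tokenize(query)
    ranked = sorted(snippets, key=lambda s: jaccard(q, tokenize(s)), reverse=True)
    return ranked[:k]


def build_prompt(query, snippets, k=2):
    # Hypothetical prompt layout: retrieved snippets first, then the code
    # to be completed.
    context = "\n\n".join(retrieve(query, snippets, k))
    return context + "\n\n" + query


if __name__ == "__main__":
    project_snippets = [
        "def add(a, b):\n    return a + b",
        "class Dog:\n    pass",
        "def sub(a, b):\n    return a - b",
    ]
    print(build_prompt("def add(x, y):", project_snippets, k=1))
```

Because the similarity is computed on token sets, no embedding model or index is needed, which is what makes this kind of retrieval cheap enough to run on a local project.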
Taxonomy
Topics: Software Engineering Research · Model-Driven Software Engineering Techniques · Software System Performance and Reliability
Methods: Linear Warmup With Linear Decay · BERT · BART · RAG · Attention Is All You Need · Linear Layer · Attention Dropout · Residual Connection · Multi-Head Attention
