Domain Adaptive Code Completion via Language Models and Decoupled Domain Databases
Ze Tang, Jidong Ge, Shangqing Liu, Tingwei Zhu, Tongtong Xu, Liguo Huang, Bin Luo

TL;DR
This paper introduces $k$NM-LM, a retrieval-augmented, domain-adaptive code completion method that enhances large language models without fine-tuning, effectively integrating domain knowledge for improved performance across various scenarios.
Contribution
The paper presents a novel retrieval-augmented approach that adapts to different language models and domains without fine-tuning, using Bayesian inference to incorporate domain knowledge.
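To make the retrieval-augmented idea concrete, the following is a minimal sketch of generic nearest-neighbour augmentation of a language model's next-token distribution. It is not the paper's implementation: the function names, the toy 2-D context vectors, the `temperature` parameter, and the fixed mixing weight `lam` are all assumptions made for illustration. In particular, the summary above says the method uses Bayesian inference to incorporate domain knowledge, whereas this sketch substitutes a hand-picked constant weight.

```python
import math
from collections import defaultdict

def knn_distribution(query, datastore, k=2, temperature=1.0):
    """Turn the k nearest stored contexts into a next-token distribution.

    datastore is a list of (context_vector, next_token) pairs, assumed to be
    built offline from in-domain code (hypothetical layout).
    """
    nearest = sorted(datastore, key=lambda item: math.dist(query, item[0]))[:k]
    weights = defaultdict(float)
    for vector, token in nearest:
        # Closer neighbours contribute more probability mass.
        weights[token] += math.exp(-math.dist(query, vector) / temperature)
    total = sum(weights.values())
    return {token: w / total for token, w in weights.items()}

def interpolate(p_lm, p_knn, lam):
    """Mix the base model and retrieval distributions with a fixed weight."""
    tokens = set(p_lm) | set(p_knn)
    return {
        t: lam * p_knn.get(t, 0.0) + (1.0 - lam) * p_lm.get(t, 0.0)
        for t in tokens
    }

# Toy example: the base model prefers "os", but the in-domain neighbours
# both point to "np", so the mixed distribution ranks "np" first.
datastore = [((0.0, 0.0), "np"), ((0.1, 0.0), "np"), ((5.0, 5.0), "os")]
p_lm = {"os": 0.6, "np": 0.3, "sys": 0.1}
p_knn = knn_distribution((0.0, 0.1), datastore, k=2)
mixed = interpolate(p_lm, p_knn, lam=0.5)
best = max(mixed, key=mixed.get)
```

Because the mixing happens on output probabilities only, a scheme of this shape needs no access to model parameters, which is consistent with the black-box claim in the findings.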
Findings
$k$NM-LM outperforms CodeGPT and UniXcoder in intra-project and intra-scenario tasks.
The approach operates efficiently with satisfactory speed and storage usage.
It seamlessly integrates with black-box models without requiring access to model parameters.
Abstract
Large Language Models (LLMs) have demonstrated remarkable performance in code completion. However, because they lack domain-specific knowledge, they may be suboptimal at completing code that requires intensive domain knowledge, for example, library names. Several works have confirmed the effectiveness of fine-tuning for adapting language models to code completion in specific domains, but they are limited by the need to fine-tune the model repeatedly while the project is under constant iteration. To address this limitation, we propose $k$NM-LM, a retrieval-augmented language model (R-LM) that integrates domain knowledge into language models without fine-tuning. Unlike previous techniques, our approach automatically adapts to different language models and domains. Specifically, it utilizes the in-domain code to…
Taxonomy
Topics: Software Engineering Research · Topic Modeling · Natural Language Processing Techniques
