Enhancing Project-Specific Code Completion by Inferring Internal API Information

Le Deng; Xiaoxue Ren; Chao Ni; Ming Liang; David Lo; Zhongxin Liu

arXiv:2507.20888·cs.SE·July 29, 2025

TL;DR

This paper introduces an approach to project-specific code completion that infers internal API information without relying on import statements. It builds a knowledge base of API usage examples and semantic descriptions for the LLM to draw on, and it is evaluated on CrossCodeEval and a new benchmark, ProjBench.

Contribution

It proposes a method that infers internal API details without relying on imports, and it introduces ProjBench, a benchmark built from large-scale real-world projects that avoids leaked imports.

Findings

01 Improves code exact match by 22.72%
02 Enhances identifier exact match by 18.31%
03 Boosts performance when combined with existing baselines
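
The two metrics named in the findings can be illustrated with a minimal sketch. This page does not give the paper's exact definitions, so the whitespace normalization and the regex-based identifier extraction below are assumptions made for illustration, and the function names are hypothetical.

```python
import re

# Assumption: an "identifier" is any word-like token. This simple regex also
# matches language keywords, which a real evaluator would likely filter out.
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def code_exact_match(prediction: str, reference: str) -> bool:
    """True when both completions are equal after collapsing whitespace runs."""
    def normalize(text: str) -> str:
        return " ".join(text.split())
    return normalize(prediction) == normalize(reference)


def identifier_exact_match(prediction: str, reference: str) -> bool:
    """True when both completions contain the same identifiers in the same order."""
    return _IDENT.findall(prediction) == _IDENT.findall(reference)
```

Under these assumptions, a completion that calls the right internal API with a wrong literal argument would fail code exact match but still pass identifier exact match, which is why the two numbers are reported separately.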

Abstract

Project-specific code completion is a critical task that leverages context from a project to generate accurate code. State-of-the-art methods use retrieval-augmented generation (RAG) with large language models (LLMs) and project information for code completion. However, they often struggle to incorporate internal API information, which is crucial for accuracy, especially when APIs are not explicitly imported in the file. To address this, we propose a method to infer internal API information without relying on imports. Our method extends the representation of APIs by constructing usage examples and semantic descriptions, building a knowledge base for LLMs to generate relevant completions. We also introduce ProjBench, a benchmark that avoids leaked imports and consists of large-scale real-world projects. Experiments on ProjBench and CrossCodeEval show that our approach significantly…
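
The idea of a knowledge base that pairs each internal API with a usage example and a semantic description can be sketched as follows. This is not the authors' implementation: the `ApiEntry` fields, the token-overlap scoring, and the prompt layout are all assumptions chosen to keep the example self-contained, and a real system would more likely use an LLM to write the entries and a learned retriever to rank them.

```python
import re
from dataclasses import dataclass


@dataclass
class ApiEntry:
    """One hypothetical knowledge-base record for an internal API."""
    name: str
    description: str     # semantic description of what the API does
    usage_example: str   # a short call-site snippet


def tokens(text: str) -> set[str]:
    """Lowercased word pieces, splitting snake_case and camelCase names."""
    return {piece.lower() for piece in re.findall(r"[A-Za-z][a-z0-9]*", text)}


def retrieve(context: str, kb: list[ApiEntry], top_k: int = 2) -> list[ApiEntry]:
    """Rank entries by token overlap with the unfinished code (a stand-in scorer)."""
    query = tokens(context)
    scored = []
    for entry in kb:
        entry_text = f"{entry.name} {entry.description} {entry.usage_example}"
        score = len(query & tokens(entry_text))
        if score > 0:
            scored.append((score, entry))
    # Sort on the score only, so ties never compare ApiEntry objects.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored[:top_k]]


def build_prompt(context: str, entries: list[ApiEntry]) -> str:
    """Prepend retrieved API knowledge as comments, then the code to complete."""
    lines = []
    for entry in entries:
        lines.append(f"# API: {entry.name}")
        lines.append(f"# Description: {entry.description}")
        lines.append(f"# Example: {entry.usage_example}")
    return "\n".join(lines) + "\n" + context
```

Note that retrieval here depends only on the unfinished code and the knowledge base, not on the file's import statements, which mirrors the setting the abstract describes where the needed API has not been imported yet.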
