Evaluating and Achieving Controllable Code Completion in Code LLM

Jiajun Zhang; Zeyu Cui; Lei Zhang; Jian Yang; Jiaxi Yang; Qiang Liu; Zilei Wang; Binyuan Hui; Liang Wang; Junyang Lin

arXiv:2601.15879·cs.SE·January 23, 2026

TL;DR

This paper introduces C3-Bench, a benchmark of 2,195 instruction-guided code completion tasks, evaluates over 40 LLMs on it and on conventional benchmarks, and develops a fine-tuned model that outperforms existing ones, highlighting gaps in instruction following and future directions for code LLMs.

Contribution

The paper presents the first instruction-guided code completion benchmark, a data synthesis pipeline for fine-tuning, and a new model that achieves state-of-the-art results.

Findings

01. Open-source models lag behind proprietary ones in instruction-following.

02. The new benchmark reveals significant gaps in instruction-following capabilities.

03. Fine-tuning with synthesized data improves code completion performance.

Abstract

Code completion has become a central task, gaining significant attention with the rise of large language model (LLM)-based tools in software engineering. Although recent advances have greatly improved LLMs' code completion abilities, evaluation methods have not advanced equally. Most current benchmarks focus solely on functional correctness of code completions based on given context, overlooking models' ability to follow user instructions during completion, a common scenario in LLM-assisted programming. To address this limitation, we present the first instruction-guided code completion benchmark, Controllable Code Completion Benchmark (C3-Bench), comprising 2,195 carefully designed completion tasks. Through comprehensive evaluation of over 40 mainstream LLMs across C3-Bench and conventional benchmarks, we reveal substantial gaps in instruction-following capabilities between open-source…

Taxonomy

Topics: Software Engineering Research · Software Engineering Techniques and Practices · Software Testing and Debugging Techniques