Not All Tokens Matter: Data-Centric Optimization for Efficient Code Summarization

Saima Afrin; Zaiyu Cheng; Tushar Sharma; Alexander Serebrenik; Massimiliano Di Penta; Antonio Mastropaolo

arXiv:2601.20147·cs.SE·February 19, 2026

Open Access

TL;DR

This paper systematically evaluates how system prompts, model scale, prompting strategy, and programming language influence the performance of instruction-tuned language models and code language models in code generation tasks.

Contribution

It provides a comprehensive analysis of the effects of system prompts on code generation performance across various model sizes, prompting strategies, and programming languages.

Findings

01 System prompt influence increases with model scale.

02 Few-shot prompting reduces prompt sensitivity.

03 Prompt sensitivity is higher for Java than for Python.

Abstract

Instruction-tuned Language Models (ILMs) have become essential components of modern AI systems, demonstrating exceptional versatility across a wide range of natural language and reasoning tasks. Among their most impactful applications is code generation, where ILMs, commonly referred to as Code Language Models (CLMs), have demonstrated remarkable capability. This strength stems from their defining feature: the use of explicit task instructions during fine-tuning, which enables them to bridge natural language and code by translating human intent into executable code. While much of their progress has been driven by advances in scaling laws and training methodologies, one critical aspect remains underexplored: the impact of system prompts on the performance of both general-purpose ILMs and specialized CLMs when instantiated to assist users with code generation activities. In this study, we…

Taxonomy

Topics: Topic Modeling · Machine Learning and Data Classification · Natural Language Processing Techniques