Neural Variable Name Repair: Learning to Rename Identifiers for Readability
Muhammad Yousuf, Akshat Bagade, Chhittebbayi Penugonda, and Maanas Baraya

TL;DR
This paper introduces a neural approach to repairing generic or misleading variable identifiers in C++ code. It uses fine-tuned language models and a reranking step to propose more descriptive names, with the aim of making code easier to read and understand.
Contribution
The paper presents a neural method for variable name repair that combines fine-tuned Llama models with dual-encoder rerankers, and it introduces a new evaluation metric designed to recognize near-synonym names.
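A minimal sketch of how a dual-encoder reranker can pick among generated names, assuming the function context and each candidate name have already been embedded by two separate encoders. The vectors, the `rerank` helper, and the use of cosine similarity are illustrative assumptions, not details confirmed by the paper.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rerank(context_vec, candidates):
    """Order candidate names by similarity to the context embedding.

    `candidates` is a list of (name, embedding) pairs. In a real system the
    embeddings would come from trained encoders; here they are supplied by
    the caller (hypothetical toy values).
    """
    scored = [(cosine(context_vec, vec), name) for name, vec in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]

# Toy embeddings, invented for illustration only.
context_vec = [1.0, 0.0, 1.0]
candidates = [
    ("tmp", [0.0, 1.0, 0.0]),
    ("count", [0.5, 0.5, 0.5]),
    ("total", [0.9, 0.1, 0.8]),
]
ranked = rerank(context_vec, candidates)
```

Because the generator only supplies the candidate list, a scorer of this shape can be swapped or retrained independently of the generator.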
Findings
The best model achieves 43.1% exact match on variable name repair.
The reranker improves selection quality without retraining the generator.
The partial-match score indicates high potential for near-synonym repairs.
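For readers unfamiliar with these two kinds of score, here is a small sketch of exact match alongside a simple partial-credit measure. The paper's partial-match definition is not given in this summary, so `subtoken_f1` below is a stand-in assumption (set-based F1 over identifier subtokens), and all function names are hypothetical.

```python
import re

def subtokens(name):
    """Split snake_case / camelCase identifiers into lowercase subtokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [p.lower() for p in parts]

def exact_match_rate(predictions, references):
    """Fraction of predictions identical to the reference name."""
    hits = sum(p == r for p, r in zip(predictions, references))
    return hits / len(references)

def subtoken_f1(prediction, reference):
    """Set-based F1 over subtokens (assumed stand-in, not the paper's metric)."""
    pred, ref = set(subtokens(prediction)), set(subtokens(reference))
    overlap = len(pred & ref)
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

em = exact_match_rate(["total", "idx"], ["total", "index"])
same_words = subtoken_f1("bufferSize", "buffer_size")
partial = subtoken_f1("size", "buffer_size")
```

A subtoken overlap like this rewards shared words but gives no credit to true synonyms with different spellings (for example `count` versus `total`), which is the gap a near-synonym metric is meant to address.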
Abstract
Developers routinely work with source files whose variable names are generic or misleading, and with teams moving quickly, many functions are left undocumented. This slows comprehension, increases the risk of subtle bugs, and makes it harder for both humans and large language models (LLMs) to reason about code. We study variable name repair: given a real C++ function where all occurrences of one local or parameter name have been replaced by a placeholder (e.g. ID 1), the goal is to generate a natural, descriptive replacement name. We automatically construct this task from the C++ portion of BigCode's The Stack by parsing functions with Tree-sitter, masking a single identifier, and treating the original name as supervision. On top of Llama 3.1-8B, we build a pipeline with (i) warmup and dropout schedules for more stable fine-tuning, (ii) LoRA adapters for efficient specialization on…
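The task construction described in the abstract (mask every occurrence of one identifier, keep the original name as the target) can be illustrated roughly as follows. This sketch swaps in a whole-word regular expression instead of Tree-sitter, so it is only a simplification; the `mask_identifier` helper and the `ID_1` placeholder spelling are assumptions made for illustration.

```python
import re

def mask_identifier(function_src, name, placeholder="ID_1"):
    """Replace every whole-word occurrence of `name` with `placeholder`.

    Returns (masked_source, target_name). Unlike a parser-based approach,
    a regex cannot tell a variable from a same-named token inside a string
    literal, comment, or member access, so this is a simplification.
    """
    pattern = re.compile(r"\b" + re.escape(name) + r"\b")
    masked, count = pattern.subn(placeholder, function_src)
    if count == 0:
        raise ValueError(f"identifier {name!r} not found")
    return masked, name

example = (
    "int sum(const std::vector<int>& v) {\n"
    "    int t = 0;\n"
    "    for (int x : v) t += x;\n"
    "    return t;\n"
    "}"
)
masked_src, target = mask_identifier(example, "t")
```

Here the masked function becomes the model input and `target` (the original name `t`) is the supervision signal; a syntax-aware pass would additionally restrict masking to occurrences that resolve to the same local variable or parameter.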
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Data Quality and Management
