Strengthening Programming Comprehension in Large Language Models through Code Generation

Xiaoning Ren; Qiang Hu; Wei Ma; Yan Li; Yao Zhang; Lingxiao Jiang; Yinxing Xue

arXiv:2508.12620 · cs.SE · August 19, 2025

Open Access

TL;DR

This paper proposes a counterfactual code augmentation and concept-aware tuning framework to improve large language models' understanding of fundamental programming concepts, enhancing their reasoning capabilities for code-related tasks.

Contribution

It introduces a novel augmentation and tuning method to deepen LLMs' grasp of programming concepts, addressing their shallow understanding of data and control flow.

Findings

01 Enhanced performance on code reasoning benchmarks

02 Improved understanding of data and control flow concepts

03 Effective across multiple models and datasets

Abstract

Large language models (LLMs) have recently shown impressive results on diverse code-related tasks, benefiting from large-scale training and instruction tuning. However, studies reveal that their grasp of fundamental programming concepts, such as data flow and control flow, remains shallow, leading to fragile performance when code requires deeper reasoning. This limitation restricts the practical adoption of LLMs in real-world software development. To address this issue, this work introduces a counterfactual code augmentation framework combined with concept-aware tuning, designed to guide LLMs toward stronger conceptual understanding. Comprehensive evaluation across multiple models and benchmarks demonstrates the effectiveness of the proposed approach.
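The abstract does not describe how the counterfactual code is constructed, so the following is only a minimal sketch of the general idea, not the authors' pipeline. It assumes a counterfactual is a small, targeted edit to a program's control flow (here, negating one comparison) that changes its behavior, so that the original and edited versions form a contrastive pair. The function `clamp_sum`, the class `FlipFirstComparison`, and the helpers `make_counterfactual` and `run` are all hypothetical names introduced for illustration.

```python
import ast

# Hypothetical illustration only: this is NOT the paper's augmentation pipeline.
ORIGINAL = """
def clamp_sum(values, limit):
    total = 0
    for v in values:
        if v > limit:
            total += limit
        else:
            total += v
    return total
"""


class FlipFirstComparison(ast.NodeTransformer):
    """Replace the first single `>` comparison with `<=`; leave the rest untouched."""

    def __init__(self):
        self.done = False

    def visit_Compare(self, node):
        self.generic_visit(node)
        if not self.done and len(node.ops) == 1 and isinstance(node.ops[0], ast.Gt):
            node.ops = [ast.LtE()]
            self.done = True
        return node


def make_counterfactual(source):
    """Return source code with one control-flow condition negated."""
    tree = FlipFirstComparison().visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9+


def run(source, func_name, *args):
    """Execute the source in a fresh namespace and call the named function."""
    namespace = {}
    exec(source, namespace)
    return namespace[func_name](*args)


COUNTERFACTUAL = make_counterfactual(ORIGINAL)
original_out = run(ORIGINAL, "clamp_sum", [1, 5, 10], 4)
counterfactual_out = run(COUNTERFACTUAL, "clamp_sum", [1, 5, 10], 4)

print(COUNTERFACTUAL)
print(original_out, counterfactual_out)
```

A pair like this differs in a single branch condition yet produces different outputs on the same input, which is the kind of contrast that could expose whether a model tracks control flow rather than surface patterns.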

Taxonomy

Topics: Software Engineering Research · Online Learning and Analytics · Software Testing and Debugging Techniques