ConceptCoder: Improve Code Reasoning via Concept Learning

Md Mahbubur Rahman; Hengbo Tong; Wei Le

arXiv:2603.23470 · cs.SE · March 25, 2026

TL;DR

ConceptCoder enhances code reasoning in large language models by training them to recognize and reason with human-understandable code concepts, significantly improving vulnerability detection accuracy and generalizing to other tasks.

Contribution

This work is the first to formulate human-understandable code concepts and to incorporate them into LLM fine-tuning to improve code reasoning.

Findings

01

Vulnerability detection (VD) F1 improved from 66.32 to 72.15 on average over 9 open-source LLMs

02

Achieves the best VD performance against state-of-the-art baselines, including GPT-5.2 and Claude-Opus-4.5

03

Concept-based fine-tuning generalizes to other tasks like branch prediction

Abstract

Large language models (LLMs) have shown promising results for software engineering applications, but still struggle with code reasoning tasks such as vulnerability detection (VD). We introduce ConceptCoder, a fine-tuning method that simulates human code inspection: models are trained to first recognize code concepts and then perform reasoning on top of these concepts. In prior work, concepts are extracted by multimodal models or LLMs to explain vision and natural language models. Our work is the first to formulate concepts for code. We define code concepts as human-understandable semantic properties of code and train models to learn such concepts. Our evaluation shows that this approach significantly improves VD accuracy, from 66.32 to 72.15 F1 on average over 9 open-source LLMs. ConceptCoder achieves the best VD performance compared to state-of-the-art (SOTA) baselines, including…
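The "recognize concepts first, then reason" idea in the abstract can be illustrated with a small sketch of how a fine-tuning target might be laid out. This is not the authors' data format: the concept names (`bounds_check`, `null_check`), the prompt wording, and the helpers `build_target` and `build_pair` are all hypothetical, chosen only to show a target in which the concept labels precede the final verdict.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Example:
    code: str                  # source code of the function under inspection
    concepts: Dict[str, bool]  # hypothetical concept labels: name -> present?
    vulnerable: bool           # ground-truth vulnerability label


def build_target(ex: Example) -> str:
    """Serialize the concepts first and the verdict last.

    A model fine-tuned on targets in this order has to emit its concept
    predictions before it emits the final answer.
    """
    concept_lines = [
        f"{name}: {'yes' if present else 'no'}"
        for name, present in sorted(ex.concepts.items())
    ]
    verdict = "vulnerable" if ex.vulnerable else "not vulnerable"
    return "Concepts:\n" + "\n".join(concept_lines) + f"\nVerdict: {verdict}"


def build_pair(ex: Example) -> Dict[str, str]:
    """Build one prompt/completion pair (assumed format, for illustration)."""
    prompt = (
        "Inspect the function below. List the code concepts, "
        "then give a verdict.\n\n" + ex.code
    )
    return {"prompt": prompt, "completion": build_target(ex)}


if __name__ == "__main__":
    sample = Example(
        code="void copy(char *dst, const char *src) { strcpy(dst, src); }",
        concepts={"bounds_check": False, "null_check": False},
        vulnerable=True,
    )
    print(build_pair(sample)["completion"])
```

Placing the concept lines before the verdict is the only design choice this sketch makes; how the paper actually defines, extracts, and supervises concepts is not specified in the abstract above.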

Taxonomy

Topics: Software Engineering Research · Software Testing and Debugging Techniques · Information and Cyber Security