CRPE: Expanding The Reasoning Capability of Large Language Model for Code Generation

Ningxin Gui; Qianghuai Jia; Feijun Jiang; Yuling Jiao; Dechun Wang; Jerry Zhijian Yang

arXiv:2505.10594 · cs.SE · May 19, 2025

TL;DR

CRPE is a three-stage data synthesis and training framework that strengthens the code reasoning of large language models. Its COT-Coder models reach pass@1 of 21.88 (7B, on LiveCodeBench 20240701-20240901) and 35.08 (32B).

Contribution

The paper introduces CRPE, an open-source framework that improves code reasoning in LLMs through staged data synthesis and model training, and uses it to build the COT-Coder models.

Findings

01. COT-Coder-7B-StepDPO, derived from Qwen2.5-Coder-7B-Base, achieves pass@1 of 21.88 on LiveCodeBench (20240701-20240901), surpassing similar models.

02. COT-Coder-32B-StepDPO achieves pass@1 of 35.08, outperforming GPT-4o.

03. CRPE improves code generation accuracy and reasoning abilities.
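
The findings above are reported as pass@1 percentages. As a rough illustration of how such a score is typically computed, here is a minimal sketch of the commonly used unbiased pass@k estimator, averaged over problems. The sampling setup the authors used (samples per problem, decoding settings) is not stated on this page, so the function names and the toy inputs below are assumptions for illustration only, not the paper's evaluation code.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for a single problem.

    n: number of samples generated for the problem
    c: number of those samples that pass every test
    k: budget of attempts being scored
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # With fewer than k failing samples, any k-subset contains a pass.
    if n - c < k:
        return 1.0
    # 1 - P(all k drawn samples fail)
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int = 1) -> float:
    """Mean pass@k over (n, c) pairs, expressed as a percentage."""
    scores = [pass_at_k(n, c, k) for n, c in results]
    return 100.0 * sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical toy data: four problems, one sample each, one solved.
    toy = [(1, 1), (1, 0), (1, 0), (1, 0)]
    print(benchmark_pass_at_k(toy, k=1))
```

For k = 1 the estimator reduces to c / n per problem, so a benchmark-level pass@1 is simply the average fraction of passing samples, scaled to a percentage.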

Abstract

We introduce CRPE (Code Reasoning Process Enhancer), an innovative three-stage framework for data synthesis and model training that advances the development of sophisticated code reasoning capabilities in large language models (LLMs). Building upon existing system-1 models, CRPE addresses the fundamental challenge of enhancing LLMs' analytical and logical processing in code generation tasks. Our framework presents a methodologically rigorous yet implementable approach to cultivating advanced code reasoning abilities in language models. Through the implementation of CRPE, we successfully develop an enhanced COT-Coder that demonstrates marked improvements in code generation tasks. Evaluation results on LiveCodeBench (20240701-20240901) demonstrate that our COT-Coder-7B-StepDPO, derived from Qwen2.5-Coder-7B-Base, with a pass@1 accuracy of 21.88, exceeds all models with similar or even…

Taxonomy

Topics: Topic Modeling · Machine Learning in Materials Science · Natural Language Processing Techniques