DreamPRM-Code: Function-as-Step Process Reward Model with Label Correction for LLM Coding

Ruiyi Zhang; Peijia Qin; Qi Cao; Pengtao Xie

arXiv:2512.15000·cs.LG·December 18, 2025

TL;DR

DreamPRM-Code is a process reward model for code that treats each function as a reasoning step and corrects noisy intermediate labels through meta-learning. Applied to test-time scaling, it reaches 80.9 pass@1 on LiveCodeBench.

Contribution

The paper proposes a process reward model that treats code functions as reasoning steps, induced through Chain-of-Function prompting, and corrects noisy Monte-Carlo-generated step labels via meta-learning.

Findings

01

Achieved 80.9 pass@1 on LiveCodeBench, surpassing OpenAI o4-mini.

02

Introduced Chain-of-Function prompting, which induces modular code generation so that each function can serve as a reasoning step for the PRM.

03

Demonstrated that meta-learning-based label correction reduces noise in Monte-Carlo-generated intermediate labels.
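
For readers unfamiliar with the term, the label-correction finding can be read against the generic shape of a bi-level optimization problem. The formulation below is a textbook-style sketch, not the paper's exact objective: the symbols $\theta$ (PRM parameters), $\tilde{y}$ (learnable intermediate labels), $\mathcal{D}_{\text{step}}$ (partial solutions with Monte-Carlo labels), and $\mathcal{D}_{\text{final}}$ (complete solutions with unit-test labels) are all assumed names introduced here for illustration.

```latex
% Generic bi-level form (assumed notation, not taken from the paper).
% Inner problem: fit the PRM to the current, possibly noisy, step labels.
% Outer problem: adjust those labels so the fitted PRM agrees with the
% clean final-solution unit-test labels.
\begin{aligned}
\min_{\tilde{y}} \quad & \mathcal{L}_{\text{meta}}\bigl(\theta^{*}(\tilde{y});\, \mathcal{D}_{\text{final}}\bigr) \\
\text{s.t.} \quad & \theta^{*}(\tilde{y}) = \arg\min_{\theta}\; \mathcal{L}_{\text{train}}\bigl(\theta;\, \mathcal{D}_{\text{step}},\, \tilde{y}\bigr)
\end{aligned}
```

In this reading, the outer loss only ever sees labels that come from running unit tests on complete solutions, which is why those labels can act as a clean signal for refining the intermediate ones.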

Abstract

Process Reward Models (PRMs) have become essential for improving Large Language Models (LLMs) via test-time scaling, yet their effectiveness in coding remains limited due to the lack of meaningful step decompositions in code and the noise of Monte-Carlo-generated partial labels. We propose DreamPRM-Code, a coding-focused PRM that treats functions as reasoning steps using a Chain-of-Function prompting strategy to induce modular code generation, enabling PRM training and application analogous to mathematical reasoning tasks. To address label noise, DreamPRM-Code introduces a meta-learning-based correction mechanism that leverages clean final-solution unit-test labels and performs bi-level optimization to refine intermediate labels. Applied to test-time scaling, DreamPRM-Code achieves state-of-the-art performance on LiveCodeBench with an 80.9 pass@1 rate, surpassing OpenAI o4-mini.
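
To make the function-as-step idea concrete, here is a minimal Python sketch of how a PRM could be applied at test time under that decomposition. It is an illustration under stated assumptions, not the authors' implementation: the helper names (`split_into_function_steps`, `score_candidate`, `best_of_n`), the mean aggregation over step scores, and the `toy_step_scorer` stand-in for a trained PRM are all hypothetical. Only the general scheme (split a candidate program into functions, score each growing prefix of functions, pick the best candidate) follows the abstract.

```python
import ast
from statistics import mean
from typing import Callable, List, Sequence

# A step scorer maps a prefix of function "steps" to a score in [0, 1].
# In practice this would be a trained PRM; here it is a hypothetical interface.
StepScorer = Callable[[Sequence[str]], float]


def split_into_function_steps(source: str) -> List[str]:
    """Return the source text of each top-level function, in file order."""
    tree = ast.parse(source)
    steps = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            segment = ast.get_source_segment(source, node)
            if segment is not None:
                steps.append(segment)
    return steps


def score_candidate(source: str, step_scorer: StepScorer,
                    aggregate: Callable[[List[float]], float] = mean) -> float:
    """Score every prefix of functions and aggregate the step scores.

    Mean aggregation is an assumption; min or last-step score are
    other common choices.
    """
    steps = split_into_function_steps(source)
    if not steps:
        return 0.0  # no functions found, so there is nothing to score
    prefix_scores = [step_scorer(steps[:i]) for i in range(1, len(steps) + 1)]
    return aggregate(prefix_scores)


def best_of_n(candidates: Sequence[str], step_scorer: StepScorer) -> str:
    """Pick the candidate program with the highest aggregated score."""
    return max(candidates, key=lambda c: score_candidate(c, step_scorer))


def toy_step_scorer(prefix: Sequence[str]) -> float:
    """Hypothetical stand-in for a PRM: rewards a latest step that returns."""
    return 1.0 if "return" in prefix[-1] else 0.0
```

Calling `best_of_n(candidates, toy_step_scorer)` on a list of generated programs returns the one whose functions score best on average. Swapping `toy_step_scorer` for a real model call is the only change needed to use the same selection loop with a trained PRM.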

Taxonomy

Topics: Machine Learning and Data Classification · Topic Modeling · Computational and Text Analysis Methods