Dynamic Scaling of Unit Tests for Code Reward Modeling
Zeyao Ma, Xiaokang Zhang, Jing Zhang, Jifan Yu, Sijia Luo, Jie Tang

TL;DR
This paper introduces a dynamic scaling approach for unit tests in code reward modeling, improving large language model performance by adaptively increasing test quantity based on problem difficulty.
Contribution
It proposes CodeRM-8B, a lightweight unit test generator, and a dynamic scaling mechanism that adaptively adjusts the number of unit tests to enhance reward signals and model performance.
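The idea of adjusting the number of unit tests per problem can be illustrated with a small budget-allocation sketch. This is not the paper's mechanism, which this summary does not describe. The function name `allocate_tests`, the numeric difficulty scores, and the proportional rule are all assumptions made for illustration: a fixed budget of unit tests is split across problems so that harder problems receive more tests.

```python
from typing import List, Sequence


def allocate_tests(difficulty: Sequence[float], budget: int, minimum: int = 1) -> List[int]:
    """Split a unit-test budget across problems in proportion to difficulty.

    Hypothetical helper: `difficulty` holds one non-negative score per problem
    (higher means harder). Every problem gets at least `minimum` tests.
    """
    n = len(difficulty)
    if budget < n * minimum:
        raise ValueError("budget too small to give every problem the minimum")

    remaining = budget - n * minimum
    total = sum(difficulty)
    if total <= 0:
        # No difficulty information: spread the remainder evenly.
        shares = [remaining / n] * n
    else:
        shares = [remaining * d / total for d in difficulty]

    # Give each problem the whole part of its share first.
    floors = [int(s) for s in shares]
    alloc = [minimum + f for f in floors]

    # Hand out the leftover tests to the largest fractional parts.
    leftover = remaining - sum(floors)
    order = sorted(range(n), key=lambda i: shares[i] - floors[i], reverse=True)
    for i in order[:leftover]:
        alloc[i] += 1
    return alloc


# Three problems, the third twice as hard as the others, 11 tests in total.
example_alloc = allocate_tests([0.25, 0.25, 0.5], budget=11)
print(example_alloc)
```

In practice the difficulty scores would have to come from some estimator; here they are simply given as inputs.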
Findings
Scaling unit tests improves reward signal quality.
Dynamic scaling yields significant performance gains on benchmarks.
More benefits are observed on more challenging problems.
Abstract
Current large language models (LLMs) often struggle to produce accurate responses on the first attempt for complex reasoning tasks such as code generation. Prior research tackles this challenge by generating multiple candidate solutions and validating them with LLM-generated unit tests. The execution results of the unit tests serve as reward signals for identifying correct solutions. Because LLMs often make mistakes with confidence, these unit tests are not reliable, which diminishes the quality of the reward signals. Motivated by the observation that scaling the number of solutions improves LLM performance, we explore the impact of scaling unit tests to enhance reward signal quality. Our pioneering experiment reveals a positive correlation between the number of unit tests and reward signal quality, with greater benefits observed on more challenging problems. Based on these insights, we propose CodeRM-8B, a…
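The reward scheme the abstract describes (run every candidate solution against every generated unit test, then prefer the candidate that passes the most tests) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: candidates are modeled as plain Python callables, a unit test is a hypothetical `(input, expected_output)` pair, and the names `passes`, `rewards`, and `select_best` are invented for the example. One deliberately wrong test stands in for an unreliable LLM-generated test.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical representation: a unit test is an (input, expected_output) pair.
UnitTest = Tuple[int, int]
Solution = Callable[[int], int]


def passes(solution: Solution, test: UnitTest) -> bool:
    """Run one test against one candidate; any exception counts as a failure."""
    arg, expected = test
    try:
        return solution(arg) == expected
    except Exception:
        return False


def rewards(solutions: Sequence[Solution], tests: Sequence[UnitTest]) -> List[int]:
    """Reward of a candidate = number of unit tests it passes."""
    return [sum(passes(s, t) for t in tests) for s in solutions]


def select_best(solutions: Sequence[Solution], tests: Sequence[UnitTest]) -> int:
    """Index of the highest-reward candidate (the first one wins ties)."""
    scores = rewards(solutions, tests)
    return max(range(len(scores)), key=scores.__getitem__)


# Task: square an integer. Only the first candidate is correct.
candidates = [lambda x: x * x, lambda x: x + x, lambda x: x ** 3]

# The first test is wrong on purpose, imitating a faulty generated test.
tests = [(1, 2), (2, 4), (3, 9), (0, 0), (4, 16)]

few = select_best(candidates, tests[:2])   # only two tests, one of them faulty
many = select_best(candidates, tests)      # all five tests
print(few, many)
```

With only the first two tests, the faulty test tips the selection toward the incorrect doubling candidate; with all five tests, the correct candidate collects the highest reward. This toy case mirrors the abstract's claim that more unit tests improve the reward signal, without reproducing any of the paper's results.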
Taxonomy
Topics: Software Testing and Debugging Techniques · Software Engineering Research · Model-Driven Software Engineering Techniques
