CodeGrad: Integrating Multi-Step Verification with Gradient-Based LLM Refinement

Yueke Zhang; Yifan Zhang; Kevin Leach; Yu Huang

arXiv:2508.10059 · cs.SE · September 4, 2025

Open Access

TL;DR

CodeGrad is a framework that builds verification into an iterative LLM refinement loop guided by a textual pseudo-gradient. It aims to produce code that is more correct, robust, and mathematically justified, and it outperforms strong baselines on HumanEval, HumanEval+, and LiveCodeBench.

Contribution

CodeGrad treats code as a differentiable variable: structured feedback and mathematical constraints are converted into a textual pseudo-gradient that guides iterative refinement of the solution.
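
The verify-then-refine idea can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`verify`, `textual_gradient`, `refine`, `toy_propose`) are hypothetical, the verifier here is only a set of input/output test cases, and `toy_propose` is a hard-coded stand-in for what would be an LLM call in a real system.

```python
from typing import Callable, List, Tuple

Case = Tuple[tuple, object]  # (positional arguments, expected result)


def verify(source: str, entry: str, cases: List[Case]) -> List[str]:
    """Run the candidate source against test cases; return failure messages."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # load the candidate definition
        func = namespace[entry]
    except Exception as exc:
        return [f"candidate does not load: {exc!r}"]
    failures = []
    for args, expected in cases:
        try:
            got = func(*args)
        except Exception as exc:
            failures.append(f"{entry}{args} raised {exc!r}")
            continue
        if got != expected:
            failures.append(f"{entry}{args} returned {got!r}, expected {expected!r}")
    return failures


def textual_gradient(failures: List[str]) -> str:
    """Turn structured verifier feedback into a natural-language update signal."""
    return "Revise the code to fix:\n" + "\n".join(f"- {f}" for f in failures)


def refine(source: str, entry: str, cases: List[Case],
           propose: Callable[[str, str], str], max_steps: int = 3) -> Tuple[str, bool]:
    """Alternate verification and refinement until the candidate passes or steps run out."""
    for _ in range(max_steps):
        failures = verify(source, entry, cases)
        if not failures:
            return source, True
        source = propose(source, textual_gradient(failures))
    return source, not verify(source, entry, cases)


def toy_propose(source: str, gradient: str) -> str:
    # Hypothetical stand-in for an LLM: applies one hard-coded repair.
    return source.replace("return a - b", "return a + b")


buggy = "def add(a, b):\n    return a - b\n"
cases = [((1, 2), 3), ((0, 0), 0)]
final, ok = refine(buggy, "add", cases, toy_propose)
print(ok)
print(final)
```

In this sketch the "gradient" is just a formatted list of failing cases. A fuller system in the spirit of the paper would presumably also feed in constraint violations and other structured feedback, which is not modeled here.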

Findings

01. Achieves up to 27% absolute improvement on HumanEval
02. Attains 41% relative improvement on LiveCodeBench V6
03. Generates mathematically justified and robust code
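
The findings mix two kinds of improvement, so it may help to spell out the difference. The short sketch below computes both from a pair of scores; the function names are illustrative and the numbers are hypothetical, chosen only to show the arithmetic, and are not results from the paper.

```python
def absolute_improvement(baseline: float, new: float) -> float:
    """Gain in percentage points: the plain difference between two scores."""
    return new - baseline


def relative_improvement(baseline: float, new: float) -> float:
    """Gain expressed as a percentage of the baseline score."""
    return 100.0 * (new - baseline) / baseline


# Hypothetical pass rates in percent, NOT taken from the paper.
baseline_score, new_score = 50.0, 60.0
abs_gain = absolute_improvement(baseline_score, new_score)
rel_gain = relative_improvement(baseline_score, new_score)
print(f"absolute: {abs_gain:.1f} points, relative: {rel_gain:.1f}%")
```

The same 10-point gain corresponds to a 20% relative gain at a baseline of 50, and the relative figure grows as the baseline shrinks, which is why the two numbers are not directly comparable across benchmarks.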

Abstract

While Large Language Models (LLMs) have demonstrated remarkable capabilities in code generation, they often produce solutions that lack guarantees of correctness, robustness, and efficiency. This limitation is particularly acute in domains requiring strict constraints. CodeGrad introduces a principled framework that integrates rigorous verification techniques directly into an iterative LLM-based generation loop. It uniquely treats code as a differentiable variable, converting structured feedback and mathematical constraints into a textual pseudo-gradient. This gradient guides the model to iteratively refine solutions, ensuring they are not only functional but also robust and mathematically justified. We evaluate CodeGrad on the HumanEval, HumanEval+, and LiveCodeBench benchmarks. Our implementation outperforms strong baselines, achieving an absolute improvement of up to 27% on…

Taxonomy

Topics: Topic Modeling · Computational and Text Analysis Methods · Artificial Intelligence in Healthcare and Education