Retrieval-Augmented Process Reward Model for Generalizable Mathematical Reasoning

Jiachen Zhu; Congmin Zheng; Jianghao Lin; Kounianhua Du; Ying Wen; Yong Yu; Jun Wang; Weinan Zhang

arXiv:2502.14361 · cs.AI · February 21, 2025

TL;DR

This paper introduces RetrievalPRM, a retrieval-augmented framework that significantly improves the generalization and accuracy of process reward models in mathematical reasoning, especially on out-of-distribution problems.

Contribution

The paper proposes RetrievalPRM, a novel retrieval-augmented approach that enhances process reward models' ability to handle out-of-distribution reasoning tasks in mathematics.

Findings

1. RetrievalPRM outperforms existing baselines on multiple datasets.

2. The framework improves reasoning consistency across different models.

3. Open-source dataset and tools support further research.

Abstract

While large language models (LLMs) have significantly advanced mathematical reasoning, Process Reward Models (PRMs) have been developed to evaluate the logical validity of reasoning steps. However, PRMs still struggle with out-of-distribution (OOD) challenges. This paper identifies key OOD issues, including step OOD, caused by differences in reasoning patterns across model types and sizes, and question OOD, which arises from dataset shifts between training data and real-world problems. To address these issues, we introduce Retrieval-Augmented Process Reward Model (RetrievalPRM), a novel framework designed to tackle these OOD issues. By utilizing a two-stage retrieval-enhanced mechanism, RetrievalPRM retrieves semantically similar questions and steps as a warmup, enhancing PRM's ability to evaluate target steps and improving generalization and reasoning consistency across different…
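The two-stage retrieval described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state which encoder, similarity measure, or prompt layout RetrievalPRM uses, so the bag-of-words `embed` function, the cosine ranking, the pool field names (`question`, `solution`, `step`, `label`), and the `build_prompt` layout are all hypothetical stand-ins.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a real text encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, pool, field, k):
    """Return the k pool entries whose `field` text is most similar to the query."""
    q = embed(query)
    ranked = sorted(pool, key=lambda item: cosine(q, embed(item[field])), reverse=True)
    return ranked[:k]


def build_prompt(question, target_step, question_pool, step_pool, k=1):
    """Assemble a hypothetical PRM input with retrieved references as warmup."""
    lines = ["Similar questions:"]
    # Stage 1 (question level): retrieve questions similar to the target question.
    for ref in top_k(question, question_pool, "question", k):
        lines.append(f"- {ref['question']} -> {ref['solution']}")
    lines.append("Similar steps:")
    # Stage 2 (step level): retrieve steps similar to the step being judged.
    for ref in top_k(target_step, step_pool, "step", k):
        lines.append(f"- {ref['step']} [{ref['label']}]")
    lines.append(f"Question: {question}")
    lines.append(f"Step to evaluate: {target_step}")
    return "\n".join(lines)


# Made-up example pools, for illustration only.
question_pool = [
    {"question": "Solve 2x + 3 = 7 for x", "solution": "x = 2"},
    {"question": "Find the area of a circle with radius 3", "solution": "9*pi"},
]
step_pool = [
    {"step": "Subtract 3 from both sides to get 2x = 4", "label": "correct"},
    {"step": "The area is pi times radius squared", "label": "correct"},
]

prompt = build_prompt(
    "Solve 3x + 5 = 11 for x",
    "Subtract 5 from both sides to get 3x = 6",
    question_pool,
    step_pool,
)
print(prompt)
```

In this sketch the retrieved references are simply prepended to the text that a PRM would score; a real system would presumably swap `embed` for a learned sentence encoder and tune `k`, but those choices are assumptions here rather than details taken from the paper.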

Taxonomy

Topics: Multi-Criteria Decision Making · Statistical and Computational Modeling · AI-based Problem Solving and Planning