Socratic-PRMBench: Benchmarking Process Reward Models with Systematic Reasoning Patterns
Xiang Li, Haiyang Yu, Xinghua Zhang, Ziyang Huang, Shizhu He, Kang Liu, Jun Zhao, Fei Huang, Yongbin Li

TL;DR
Socratic-PRMBench is a benchmark that evaluates process reward models (PRMs) across six reasoning patterns. It reveals significant weaknesses in current PRMs and is intended to guide future work on systematic reasoning evaluation.
Contribution
This paper introduces Socratic-PRMBench, the first benchmark to systematically evaluate PRMs under diverse reasoning patterns, a gap left by existing benchmarks that focus mainly on stepwise correctness.
Findings
Current PRMs perform poorly across various reasoning patterns.
Existing benchmarks mainly assess stepwise correctness, not reasoning patterns.
Socratic-PRMBench contains 2995 flawed reasoning paths across six patterns.
Abstract
Process Reward Models (PRMs) are crucial in complex reasoning and problem-solving tasks (e.g., LLM agents with long-horizon decision-making) because they verify the correctness of each intermediate reasoning step. In real-world scenarios, LLMs may apply various reasoning patterns (e.g., decomposition) to solve a problem, and each pattern can introduce its own kinds of errors. PRMs must therefore identify errors under these patterns during the reasoning process. However, existing benchmarks mainly evaluate PRMs on stepwise correctness and do not systematically evaluate them under different reasoning patterns. To address this gap, we introduce Socratic-PRMBench, a new benchmark that evaluates PRMs systematically under six reasoning patterns: Transformation, Decomposition, Regather, Deduction, Verification, and Integration.…
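As a rough illustration of what evaluating a PRM "under reasoning patterns" could look like, the sketch below groups step-level judgments by pattern and reports an accuracy per pattern. Everything here is an assumption made for illustration: the `ReasoningPath` record, the `StepScorer` signature, the 0.5 threshold, and the use of plain accuracy are hypothetical and are not taken from the paper, whose actual data format and metrics are not described in this summary.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List

# The six pattern names come from the abstract; the rest is hypothetical.
PATTERNS = (
    "Transformation", "Decomposition", "Regather",
    "Deduction", "Verification", "Integration",
)


@dataclass
class ReasoningPath:
    """Hypothetical benchmark item: one reasoning path with step labels."""
    pattern: str              # one of PATTERNS
    steps: List[str]          # the intermediate reasoning steps
    step_labels: List[bool]   # True = step is correct, False = step is flawed


# Hypothetical PRM interface: (all steps, index of step to judge) -> score in [0, 1].
StepScorer = Callable[[List[str], int], float]


def per_pattern_accuracy(
    paths: List[ReasoningPath],
    scorer: StepScorer,
    threshold: float = 0.5,
) -> Dict[str, float]:
    """Fraction of steps judged correctly, reported separately per pattern."""
    correct: Dict[str, int] = defaultdict(int)
    total: Dict[str, int] = defaultdict(int)
    for path in paths:
        for i, label in enumerate(path.step_labels):
            # A step is predicted correct when its score reaches the threshold.
            predicted_ok = scorer(path.steps, i) >= threshold
            correct[path.pattern] += int(predicted_ok == label)
            total[path.pattern] += 1
    return {pattern: correct[pattern] / total[pattern] for pattern in total}


def always_accept(steps: List[str], index: int) -> float:
    """Toy stand-in for a PRM that never flags an error."""
    return 1.0


demo_paths = [
    ReasoningPath("Deduction", ["x = 2", "so x + 2 = 5"], [True, False]),
    ReasoningPath("Decomposition", ["split into parts a and b", "solve a"], [True, True]),
]

if __name__ == "__main__":
    print(per_pattern_accuracy(demo_paths, always_accept))
```

Reporting results per pattern rather than as one pooled number is the point of the sketch: a scorer that never flags errors looks perfect on the toy Decomposition path but misses the flawed Deduction step.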
Taxonomy
Topics: Business Process Modeling and Analysis · Explainable Artificial Intelligence (XAI) · AI-based Problem Solving and Planning
