SPARE: Single-Pass Annotation with Reference-Guided Evaluation for Automatic Process Supervision and Reward Modelling
Md Imbesat Hassan Rizvi, Xiaodan Zhu, Iryna Gurevych

TL;DR
SPARE is a single-pass, reference-guided framework for efficient, high-quality process annotation of LLM reasoning. It improves both training and decoding, with advantages in data efficiency and speed.
Contribution
The paper presents SPARE, a novel structured method for per-step annotation that aligns solutions to references, enabling efficient training and decoding in LLM reasoning tasks.
Findings
SPARE improves process reward modeling and fine-tuning across diverse datasets.
It achieves data-efficient out-of-distribution generalization with only 16% of training samples.
SPARE offers 2.3× speedup over MCTS-based methods while maintaining competitive performance.
Abstract
Process or step-wise supervision has played a crucial role in advancing complex multi-step reasoning capabilities of Large Language Models (LLMs). However, efficient, high-quality automated process annotation remains a significant challenge. To address this, we introduce Single-Pass Annotation with Reference-Guided Evaluation (SPARE), a novel structured framework that enables efficient per-step annotation by jointly aligning solution steps to reference solutions and determining their accuracy with explicit reasoning in a single generation. We demonstrate SPARE's effectiveness across four diverse datasets spanning mathematical reasoning (GSM8K, MATH), multi-hop question answering (MuSiQue-Ans), and spatial reasoning (SpaRP), showing consistent improvements in two applications: (1) training Process Reward Models (PRMs) for ranking and aggregating multiple generations, and (2) fine-tuning models…
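To make application (1) concrete, here is a minimal Python sketch of how per-step PRM scores might be used to rank and to aggregate multiple generations. It is an illustration under stated assumptions, not the paper's implementation: the function names, the candidate dictionary layout, the example scores, and the choice of `min` or product to collapse step scores are all hypothetical.

```python
from collections import defaultdict
from math import prod


def solution_score(step_scores, how="min"):
    """Collapse per-step PRM scores into one solution-level score.

    The "min" and "prod" reductions are assumptions made for this sketch.
    """
    if not step_scores:
        return 0.0
    if how == "min":
        return min(step_scores)
    if how == "prod":
        return prod(step_scores)
    raise ValueError(f"unknown aggregation: {how}")


def best_of_n(candidates, how="min"):
    """Ranking: return the final answer of the highest-scoring candidate."""
    best = max(candidates, key=lambda c: solution_score(c["step_scores"], how))
    return best["answer"]


def weighted_vote(candidates, how="min"):
    """Aggregation: sum solution scores per distinct answer, pick the largest total."""
    totals = defaultdict(float)
    for c in candidates:
        totals[c["answer"]] += solution_score(c["step_scores"], how)
    return max(totals, key=totals.get)


# Hypothetical sampled generations with made-up per-step PRM scores.
candidates = [
    {"answer": "18", "step_scores": [0.9, 0.8, 0.7]},
    {"answer": "20", "step_scores": [0.95, 0.9, 0.85]},
    {"answer": "18", "step_scores": [0.8, 0.6, 0.9]},
]

print(best_of_n(candidates))      # ranking picks the single best-scored solution
print(weighted_vote(candidates))  # aggregation pools scores across equal answers
```

On this toy input the two strategies disagree: ranking selects the single highest-scoring solution, while weighted voting favors the answer that two lower-scoring solutions share.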
Taxonomy
Topics: Business Process Modeling and Analysis · Service-Oriented Architecture and Web Services · Semantic Web and Ontologies
