Fin-PRM: A Domain-Specialized Process Reward Model for Financial Reasoning in Large Language Models

Jie Zhu; Yuanchen Zhou; Shuo Jiang; Junhui Li; Lifan Guo; Feng Chen; Chi Zhang

arXiv:2508.15202 · cs.CL · May 5, 2026

TL;DR

Fin-PRM is a specialized process reward model designed for financial reasoning in large language models, improving intermediate step verification and overall reasoning accuracy.

Contribution

The paper introduces a domain-specific, trajectory-aware process reward model for financial reasoning, trained on a dataset of 3K financial reasoning trajectories with automatically derived step- and trajectory-level labels.

Findings

01  Fin-PRM outperforms general PRMs on financial benchmarks.

02  It improves offline trajectory selection and test-time inference.

03  The model effectively integrates step- and trajectory-level rewards.

Abstract

Process Reward Models (PRMs) supervise intermediate reasoning steps in large language models (LLMs), but existing PRMs are mainly trained on general-domain data and struggle with the structured, symbolic, and fact-sensitive nature of financial reasoning. Financial tasks require not only correct final answers but also verifiable intermediate steps grounded in domain knowledge. In this paper, we propose Fin-PRM, a domain-specialized, trajectory-aware PRM for financial reasoning that jointly models step-level correctness and trajectory-level coherence, producing binary supervision signals for both local and global reasoning quality. To support reliable supervision, we construct a high-quality financial reasoning dataset of 3K trajectories, where step- and trajectory-level labels are automatically derived from multi-source reward signals, including Monte Carlo rollouts, LLM-based…
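
To make the idea of combining step-level and trajectory-level signals concrete, the following is a minimal sketch of reward-guided Best-of-N selection at test time. It is an illustration under stated assumptions, not the paper's implementation: the `ScoredTrajectory` container, the mean over step scores, the linear mix controlled by `step_weight`, and the assumption that scores lie in [0, 1] are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ScoredTrajectory:
    """A candidate reasoning trajectory with already-computed reward scores.

    All fields are hypothetical: a real PRM would produce the scores.
    """
    answer: str
    step_scores: List[float]   # assumed per-step correctness scores in [0, 1]
    trajectory_score: float    # assumed whole-trajectory coherence score in [0, 1]


def combined_reward(t: ScoredTrajectory, step_weight: float = 0.5) -> float:
    """Mix the mean step score with the trajectory score.

    The linear mix and the mean aggregation are assumptions for illustration.
    """
    if not t.step_scores:
        raise ValueError("trajectory has no steps to score")
    if not 0.0 <= step_weight <= 1.0:
        raise ValueError("step_weight must be in [0, 1]")
    mean_step = sum(t.step_scores) / len(t.step_scores)
    return step_weight * mean_step + (1.0 - step_weight) * t.trajectory_score


def best_of_n(candidates: Sequence[ScoredTrajectory],
              step_weight: float = 0.5) -> ScoredTrajectory:
    """Return the candidate with the highest combined reward."""
    if not candidates:
        raise ValueError("need at least one candidate")
    return max(candidates, key=lambda t: combined_reward(t, step_weight))


if __name__ == "__main__":
    # Toy scores, invented purely to exercise the functions above.
    pool = [
        ScoredTrajectory("A", step_scores=[1.0, 0.0], trajectory_score=1.0),
        ScoredTrajectory("B", step_scores=[1.0, 1.0], trajectory_score=0.0),
        ScoredTrajectory("C", step_scores=[0.5, 0.5], trajectory_score=0.5),
    ]
    print(best_of_n(pool).answer)                   # equal weighting
    print(best_of_n(pool, step_weight=1.0).answer)  # step scores only
```

In this toy pool, equal weighting favors the trajectory whose global score is high despite one weak step, while relying on step scores alone favors the trajectory with uniformly strong steps. That contrast is the kind of trade-off a jointly step- and trajectory-aware reward is meant to mediate.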

Code & Models

Repositories

aliyun/qwen-dianjin (GitHub)
