FG-PRM: Fine-grained Hallucination Detection and Mitigation in Language Model Mathematical Reasoning

Ruosen Li; Ziming Luo; Xinya Du

arXiv:2410.06304 · cs.CL · September 19, 2025

TL;DR

This paper introduces FG-PRM, a model that detects and mitigates specific types of hallucinations in LLMs during mathematical reasoning, improving accuracy and interpretability through fine-grained supervision and automated data generation.

Contribution

The paper makes three contributions: a taxonomy of six hallucination types in mathematical reasoning, a fine-grained, step-level detection and mitigation model (FG-PRM), and an automated, LLM-based method for generating the training data.

Findings

01. FG-PRM outperforms existing methods in hallucination detection

02. Significantly improves LLM performance on GSM8K and MATH benchmarks

03. Automated data generation reduces manual labeling effort

Abstract

Hallucinations in large language models (LLMs) pose significant challenges in tasks requiring complex multi-step reasoning, such as mathematical problem-solving. Existing approaches primarily detect the presence of hallucinations but lack a nuanced understanding of their types and manifestations. In this paper, we first introduce a comprehensive taxonomy that categorizes the common hallucinations in mathematical reasoning tasks into six types. We then propose FG-PRM (Fine-Grained Process Reward Model), an augmented model designed to detect and mitigate hallucinations in a fine-grained, step-level manner. To address the limitations of manually labeling training data, we propose an automated method for generating fine-grained hallucination data using LLMs. Our FG-PRM demonstrates superior performance across two key tasks: 1) Fine-grained hallucination detection: classifying hallucination…
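
To make the step-level idea concrete, here is a minimal sketch of how fine-grained scores could serve both detection and mitigation. It is not the authors' implementation. The type labels (`type_1` through `type_6`) are placeholders because this excerpt does not name the six categories, and the log-probability sum, the 0.5 threshold, and all function names are assumptions chosen for illustration.

```python
import math
from typing import Dict, List, Tuple

# Placeholder labels (assumption): the abstract states there are six
# hallucination types, but this excerpt does not name them.
HALLUCINATION_TYPES = [f"type_{i}" for i in range(1, 7)]

# For one reasoning step: type label -> probability that the step is
# free of that hallucination type (as a per-type detector might emit).
StepScores = Dict[str, float]


def flag_steps(steps: List[StepScores], threshold: float = 0.5) -> List[Tuple[int, str]]:
    """Detection: list (step index, type) pairs whose score is below threshold."""
    return [
        (i, t)
        for i, step in enumerate(steps)
        for t in HALLUCINATION_TYPES
        if step[t] < threshold
    ]


def solution_score(steps: List[StepScores]) -> float:
    """Aggregate step-level scores into one solution-level score.

    Assumed aggregation: sum of log-probabilities over all steps and types,
    so a single low-scoring step pulls the whole solution down.
    """
    total = 0.0
    for step in steps:
        for t in HALLUCINATION_TYPES:
            p = min(max(step[t], 1e-9), 1.0)  # clamp to avoid log(0)
            total += math.log(p)
    return total


def pick_best(candidates: List[List[StepScores]]) -> int:
    """Mitigation: return the index of the highest-scoring candidate solution."""
    return max(range(len(candidates)), key=lambda i: solution_score(candidates[i]))


if __name__ == "__main__":
    clean_step = {t: 0.9 for t in HALLUCINATION_TYPES}
    bad_step = dict(clean_step, type_3=0.1)  # one step flagged for one type
    candidates = [[clean_step, bad_step], [clean_step, clean_step]]
    print("flagged in candidate 0:", flag_steps(candidates[0]))
    print("selected candidate:", pick_best(candidates))
```

Keeping the per-type scores separate until the final aggregation is what makes the output interpretable: the flagged pairs say which step failed and in what way, while the aggregate is only used to rank candidate solutions.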

Taxonomy

Topics: Big Data and Digital Economy · Topological and Geometric Data Analysis