PRISM: Probing Reasoning, Instruction, and Source Memory in LLM Hallucinations

Yuhe Wu; Guangyu Wang; Yuran Chen; Jiatong Zhang; Yutong Zhang; Yujie Chen; Jiaming Shang; Guang Zhang; Zhuang Liu

arXiv:2604.16909·cs.CL·April 28, 2026

TL;DR

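PRISM is a diagnostic benchmark that separates LLM hallucinations into four categories (knowledge missing, knowledge errors, reasoning errors, and instruction-following errors) tied to three generation stages (memory, instruction, and reasoning), so that evaluation shows where hallucinations originate rather than only how severe they are.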

Contribution

The paper introduces PRISM, a stage-aware benchmark with 9,448 instances across 65 tasks for fine-grained hallucination diagnosis in LLMs.

Findings

01

The evaluation uncovers trade-offs between instruction following, memory retrieval, and reasoning.

02

Mitigation strategies often improve some hallucination dimensions but worsen others.

03

PRISM helps identify the specific mechanisms behind LLM hallucinations.

Abstract

As large language models (LLMs) evolve from conversational assistants into agents capable of handling complex tasks, they are increasingly deployed in high-risk domains. However, existing benchmarks largely rely on mixed queries and posterior, output-level scoring, which quantifies hallucination severity but offers limited insight into where and why hallucinations arise in the generation pipeline. We therefore reformulate hallucination evaluation as a diagnostic problem and propose PRISM, a controlled benchmark that disentangles hallucinations into four dimensions (knowledge missing, knowledge errors, reasoning errors, and instruction-following errors) grounded in three stages of generation (memory, instruction, and reasoning). PRISM contains 9,448 instances across 65 tasks and supports fine-grained, stage-aware diagnostic evaluation. Evaluating 24 mainstream open-source and…
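To make the idea of stage-aware diagnosis concrete, the following is a minimal sketch of how per-instance judgments might be rolled up by stage and dimension. It is not the authors' evaluation code. The pairing of dimensions to stages in `DIMENSION_TO_STAGE` is an assumption (the abstract names the four dimensions and three stages but does not state the pairing), and the `Result` record and `hallucination_rates` function are hypothetical names introduced only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass

# ASSUMPTION: this dimension-to-stage pairing is inferred from the names in
# the abstract; the source does not spell it out.
DIMENSION_TO_STAGE = {
    "knowledge_missing": "memory",
    "knowledge_error": "memory",
    "instruction_following_error": "instruction",
    "reasoning_error": "reasoning",
}


@dataclass(frozen=True)
class Result:
    """One judged model output (hypothetical record layout)."""
    task: str
    dimension: str       # one of the keys in DIMENSION_TO_STAGE
    hallucinated: bool   # True if the output was judged a hallucination


def hallucination_rates(results):
    """Return {stage: {dimension: hallucination rate}} for the given results."""
    counts = defaultdict(lambda: [0, 0])  # dimension -> [hallucinated, total]
    for r in results:
        if r.dimension not in DIMENSION_TO_STAGE:
            raise ValueError(f"unknown dimension: {r.dimension!r}")
        counts[r.dimension][0] += int(r.hallucinated)
        counts[r.dimension][1] += 1

    report = {}
    for dimension, (bad, total) in counts.items():
        stage = DIMENSION_TO_STAGE[dimension]
        report.setdefault(stage, {})[dimension] = bad / total
    return report
```

Reporting a rate per dimension rather than one aggregate score is what would let a reader see a mitigation lowering one rate while raising another, which is the kind of trade-off the findings describe.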
