Probing LLM Hallucination from Within: Perturbation-Driven Approach via Internal Knowledge

Seongmin Lee; Hsiang Hsu; Chun-Fu Chen; Duen Horng Chau

arXiv:2411.09689 · cs.AI · September 17, 2025 · 3 cites

TL;DR

This paper introduces SHINE, a perturbation-driven method that classifies and detects hallucinations in large language models without external knowledge, supervised training, or fine-tuning, and that outperforms seven competing detection methods.

Contribution

The paper proposes hallucination probing, a new task that classifies LLM-generated text as aligned, misaligned, or fabricated, together with a perturbation-based method that improves hallucination detection across multiple LLMs without additional training.

Findings

01 SHINE outperforms seven competing methods in hallucination detection.

02 Perturbing key entities in prompts affects the generation of aligned, misaligned, and fabricated text differently.

03 SHINE is effective across three modern LLMs and four datasets.

Abstract

LLM hallucination, where unfaithful text is generated, presents a critical challenge for LLMs' practical applications. Current detection methods often resort to external knowledge, LLM fine-tuning, or supervised training with large hallucination-labeled datasets. Moreover, these approaches do not distinguish between different types of hallucinations, which is crucial for enhancing detection performance. To address such limitations, we introduce hallucination probing, a new task that classifies LLM-generated text into three categories: aligned, misaligned, and fabricated. Driven by our novel discovery that perturbing key entities in prompts affects an LLM's generation of these three types of text differently, we propose SHINE, a novel hallucination probing method that does not require external knowledge, supervised training, or LLM fine-tuning. SHINE is effective in hallucination probing…

Taxonomy

Topics: Cryptography and Residue Arithmetic · Logic, Reasoning, and Knowledge · Logic, programming, and type systems