On Large Language Models' Hallucination with Regard to Known Facts

Che Jiang; Biqing Qi; Xiangyu Hong; Dayuan Fu; Yang Cheng; Fandong Meng; Mo Yu; Bowen Zhou; Jie Zhou

arXiv:2403.20009 · cs.CL · October 29, 2024 · 1 citation

TL;DR

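This paper investigates the inference dynamics behind hallucinations in large language models that hold the correct answer knowledge. It finds that output token probabilities evolve differently across layers in correct and hallucinated cases, and it builds a classifier on those dynamics that predicts hallucinations with 88% accuracy.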

Contribution

The paper introduces an analysis of inference dynamics to explain hallucinations on known facts, and it develops a classifier that predicts hallucinations from those dynamics.

Findings

01. Hallucinations are linked to a lack of abrupt increases in the output token's probability across layers.

02. A classifier built on dynamic probability curves detects hallucinations with 88% accuracy.

03. Layer-wise token probabilities behave differently in hallucinated and correct outputs.
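
To make the first two findings concrete, the sketch below extracts one feature from a per-layer probability curve: the largest increase between consecutive layers. It is only an illustration under stated assumptions. The function names, the threshold value, and the two example curves are hypothetical, and the single-threshold rule is a stand-in, not the classifier described in the paper.

```python
def max_jump(curve):
    """Largest increase in probability between two consecutive layers."""
    return max((b - a for a, b in zip(curve, curve[1:])), default=0.0)


def looks_hallucinated(curve, jump_threshold=0.3):
    """Hypothetical rule: flag a curve that never shows an abrupt increase.

    jump_threshold is an arbitrary illustrative value, not taken from the paper.
    """
    return max_jump(curve) < jump_threshold


# Made-up curves: probability of the output token at each layer.
correct_like = [0.01, 0.01, 0.02, 0.05, 0.60, 0.85, 0.90]
hallucinated_like = [0.01, 0.02, 0.04, 0.07, 0.10, 0.14, 0.18]

for name, curve in [("correct_like", correct_like),
                    ("hallucinated_like", hallucinated_like)]:
    print(name, "flagged:", looks_hallucinated(curve))
```

A real detector would presumably learn from many labeled curves and use more of the curve's shape than one jump statistic; the point here is only how a "dynamic probability curve" can become a classifier input.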

Abstract

Large language models are successful in answering factoid questions but are also prone to hallucination. We investigate the phenomenon of LLMs possessing correct answer knowledge yet still hallucinating from the perspective of inference dynamics, an area not previously covered in studies on hallucinations. We are able to conduct this analysis via two key ideas. First, we identify the factual questions that query the same triplet knowledge but result in different answers. The difference between the model behaviors on the correct and incorrect outputs hence suggests the patterns when hallucinations happen. Second, to measure the pattern, we utilize mappings from the residual streams to vocabulary space. We reveal the different dynamics of the output token probabilities along the depths of layers between the correct and hallucinated cases. In hallucinated cases, the output token's…
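
The second idea in the abstract, mapping residual streams to vocabulary space, can be sketched as follows. This is a minimal toy version in plain Python, not the authors' implementation: the helper names, the tiny unembedding matrix, and the hidden states are all invented for illustration, and the sketch assumes the hidden state is projected directly, skipping any final normalization a real model may apply before unembedding.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def project_to_vocab(hidden, unembed):
    """Dot the hidden state with each vocabulary row of the unembedding matrix."""
    return [sum(h * w for h, w in zip(hidden, row)) for row in unembed]


def token_probability_curve(layer_states, unembed, token_id):
    """Probability assigned to token_id when each layer's state is decoded."""
    return [softmax(project_to_vocab(h, unembed))[token_id] for h in layer_states]


# Toy setup (hypothetical): 3-token vocabulary, 2-dimensional residual stream.
unembed = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
layer_states = [[0.0, 0.0], [0.5, 0.0], [3.0, 0.0]]  # one state per layer

curve = token_probability_curve(layer_states, unembed, token_id=0)
print(curve)  # probability of token 0 at each successive layer
```

Comparing such curves for prompts that query the same triplet but yield correct and incorrect answers is the kind of contrast the abstract describes.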

Code & Models

Repositories

dcdsf321/known_fact_hallucination · jax · Official

Videos

On Large Language Models’ Hallucination with Regard to Known Facts · Underline

Taxonomy

Topics: Big Data and Digital Economy · Advanced Data Processing Techniques · Machine Learning in Healthcare