Knowledge Overshadowing Causes Amalgamated Hallucination in Large Language Models

Yuji Zhang; Sha Li; Jiateng Liu; Pengfei Yu; Yi R. Fung; Jing Li; Manling Li; Heng Ji

arXiv:2407.08039·cs.CL·July 12, 2024·5 cites

TL;DR

This paper investigates how knowledge overshadowing causes hallucinations in large language models, revealing data imbalance as a key factor, and proposes methods to detect and reduce such hallucinations effectively.

Contribution

The study introduces the concept of knowledge overshadowing, analyzes its causes, and presents a training-free decoding method to mitigate hallucinations in LLMs.

Findings

01 Hallucination rate increases with data imbalance and condition length.

02 Knowledge overshadowing can be predicted using a self-contrastive decoding approach.

03 Proposed method achieves up to 82% F1 in hallucination anticipation.
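
The self-contrastive idea in the findings above can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the `alpha` weight, and the hard-coded logits are all assumptions made for illustration. The sketch compares next-token logits for a full query against logits for the same query with one condition masked out. If the two distributions barely differ, the masked condition may be overshadowed, and amplifying the difference gives that condition more influence on the chosen token.

```python
import math


def softmax(logits):
    """Convert raw logits to a probability distribution."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def js_divergence(p, q):
    """Jensen-Shannon divergence (in nats) between two distributions."""
    def kl(a, b):
        return sum(ai * math.log(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    mid = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)


def contrastive_logits(full, masked, alpha=1.0):
    """Amplify what the full query adds over the masked query.

    `alpha` is a hypothetical weight; the page does not specify one.
    """
    return [f + alpha * (f - m) for f, m in zip(full, masked)]


def argmax(values):
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical logits over a 3-token vocabulary (made-up numbers).
# Token 0 is favoured by the dominant condition alone; token 1 is the
# answer that respects both conditions.
full_query = [2.0, 1.8, 0.1]     # both conditions present
masked_query = [2.0, 0.5, 0.1]   # less popular condition masked out

greedy_choice = argmax(full_query)
adjusted_choice = argmax(contrastive_logits(full_query, masked_query))
shift = js_divergence(softmax(full_query), softmax(masked_query))
```

In this toy setup greedy decoding on the full query still picks the token favoured by the dominant condition, while the contrastive adjustment picks the token that the extra condition supports. The divergence `shift` is one possible signal for flagging queries at risk; any threshold on it would need to be tuned empirically.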

Abstract

Hallucination is often regarded as a major impediment for using large language models (LLMs), especially for knowledge-intensive tasks. Even when the training corpus consists solely of true statements, language models still generate hallucinations in the form of amalgamations of multiple facts. We coin this phenomenon as "knowledge overshadowing": when we query knowledge from a language model with multiple conditions, some conditions overshadow others, leading to hallucinated outputs. This phenomenon partially stems from training data imbalance, which we verify on both pretrained models and fine-tuned models, over a wide range of LM model families and sizes. From a theoretical point of view, knowledge overshadowing can be interpreted as over-generalization of the dominant conditions (patterns). We show that the hallucination rate grows with both the imbalance ratio (between the popular…

Taxonomy

Topics: Machine Learning in Healthcare