Triggering Hallucinations in LLMs: A Quantitative Study of Prompt-Induced Hallucination in Large Language Models

Makoto Sato

arXiv:2505.00557 · cs.CL · May 2, 2025

TL;DR

This paper introduces a systematic prompt-based framework for triggering and measuring hallucinations in large language models. It shows that susceptibility to hallucination varies across models, a result relevant to developing safer AI systems.

Contribution

The study presents a novel, reproducible method to induce and quantify hallucinations in LLMs, enabling better understanding and mitigation of their factual inaccuracies.

Findings

01. Hallucination-Inducing Prompts (HIPs) cause more hallucinations than control prompts

02. Hallucination effects vary across different LLMs

03. Reasoning-oriented models show different hallucination profiles

Abstract

Hallucinations in large language models (LLMs) present a growing challenge across real-world applications, from healthcare to law, where factual reliability is essential. Despite advances in alignment and instruction tuning, LLMs can still generate outputs that are fluent yet fundamentally untrue. Understanding the cognitive dynamics that underlie these hallucinations remains an open problem. In this study, we propose a prompt-based framework to systematically trigger and quantify hallucination: a Hallucination-Inducing Prompt (HIP), which synthetically fuses semantically distant concepts (e.g., periodic table of elements and tarot divination) in a misleading way, and a Hallucination Quantifying Prompt (HQP), which scores the plausibility, confidence, and coherence of the output. Controlled experiments across multiple LLMs revealed that HIPs consistently produced less coherent and more…
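The two-prompt setup described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the prompt wording, the function names (`build_hip`, `build_control`, `build_hqp`, `parse_scores`, `mean_scores`), the 0 to 10 scale, and the `name: value` reply format are hypothetical and are not taken from the paper. No model is called; the sketch only builds the prompts and parses a scored reply.

```python
import re
from statistics import mean

# The three dimensions named in the abstract for the HQP.
DIMENSIONS = ("plausibility", "confidence", "coherence")


def build_hip(concept_a: str, concept_b: str) -> str:
    """Hypothetical HIP: presents a link between two distant concepts as given."""
    return (
        f"Explain the well-established connection between {concept_a} "
        f"and {concept_b}, and describe its practical implications."
    )


def build_control(concept_a: str, concept_b: str) -> str:
    """Hypothetical control prompt: same concepts, no presupposed link."""
    return (
        f"Describe {concept_a} and {concept_b} separately, and state "
        f"whether any genuine connection between them exists."
    )


def build_hqp(response: str) -> str:
    """Hypothetical HQP: asks a judge model to score a response (assumed 0-10 scale)."""
    names = ", ".join(DIMENSIONS)
    return (
        f"Rate the following text from 0 to 10 on {names}. "
        f"Answer with one 'name: value' pair per line.\n\n{response}"
    )


def parse_scores(judge_output: str) -> dict[str, float]:
    """Extract one numeric score per dimension from the judge's reply."""
    scores = {}
    for dim in DIMENSIONS:
        match = re.search(rf"{dim}\s*[:=]\s*(\d+(?:\.\d+)?)", judge_output, re.IGNORECASE)
        if match is None:
            raise ValueError(f"missing score for {dim!r}")
        scores[dim] = float(match.group(1))
    return scores


def mean_scores(runs: list[dict[str, float]]) -> dict[str, float]:
    """Average each dimension over repeated runs of one condition."""
    return {dim: mean(run[dim] for run in runs) for dim in DIMENSIONS}
```

In use, one would send `build_hip(...)` and `build_control(...)` to the model under test, pass each response through `build_hqp(...)` to a judge model, and compare `mean_scores` between the two conditions. Which model serves as judge, and how many repetitions are run, are choices this sketch leaves open.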
