Beyond Functional Correctness: Exploring Hallucinations in LLM-Generated Code

Fang Liu; Yang Liu; Lin Shi; Zhen Yang; Li Zhang; Xiaoli Lian; Zhongqi Li; Yuchi Ma

arXiv:2404.00971 · cs.SE · January 22, 2026 · 34 cites

Open Access

TL;DR

This paper investigates hallucinations in LLM-generated code, categorizing their types, causes, and impacts, and explores prompt-based mitigation techniques to improve code correctness and reliability.

Contribution

It provides the first comprehensive taxonomy of code hallucinations, analyzes their distribution across models and benchmarks, and proposes training-free mitigation methods.

Findings

01. Identified 3 primary categories and 12 specific types of code hallucinations.

02. Analyzed variations of hallucinations among different LLMs and benchmarks.

03. Explored prompt-enhancement techniques for hallucination mitigation.

Abstract

The rise of Large Language Models (LLMs) has significantly advanced many software engineering applications, particularly code generation. Despite their promising performance, LLMs are prone to generating hallucinations: outputs that deviate from users' intent, exhibit internal inconsistencies, or are misaligned with real-world knowledge, which makes deploying LLMs potentially risky in a wide range of applications. Existing work mainly investigates hallucination in Natural Language Generation (NLG), leaving a gap in comprehensively understanding the types, causes, and impacts of hallucinations in code generation. To bridge the gap, we conducted a thematic analysis of LLM-generated code to summarize and categorize the hallucinations, as well as their causes and impacts. Our study established a…

Taxonomy

Topics: Low-power high-performance VLSI design · CCD and CMOS Imaging Sensors · Security and Verification in Computing