AutoHall: Automated Factuality Hallucination Dataset Generation for Large Language Models

Zouying Cao; Yifei Yang; XiaoJing Li; Hai Zhao

arXiv:2310.00259 · cs.CL · December 1, 2025 · 5 cites

TL;DR

AutoHall introduces an automated method to generate model-specific hallucination datasets for large language models, enabling better detection and understanding of hallucinations without extensive manual annotation.

Contribution

The paper presents AutoHall, a novel automated approach to create hallucination datasets tailored to specific models, facilitating improved detection and analysis of hallucinations in LLMs.

Findings

1. Hallucination rates vary across different LLMs.

2. Self-contradiction-based detection outperforms the baselines.

3. The analysis offers insights into the factors that influence hallucinations.
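The second finding mentions self-contradiction-based detection without describing it on this page. The sketch below illustrates the general idea only and should not be read as the paper's procedure: re-sample the model several times and flag the original response if any sample contradicts it. The function name and the `sample`, `contradicts`, and `k` parameters are all hypothetical. In practice both callables would wrap LLM calls, and here they are stubs.

```python
from typing import Callable


def detect_by_self_contradiction(
    prompt: str,
    original: str,
    sample: Callable[[str], str],
    contradicts: Callable[[str, str], bool],
    k: int = 3,
) -> bool:
    """Return True if any of k fresh samples contradicts the original response.

    `sample` and `contradicts` are assumed interfaces: the first draws a new
    response for the prompt, the second judges whether two texts conflict.
    """
    samples = [sample(prompt) for _ in range(k)]
    return any(contradicts(original, s) for s in samples)


# Stub usage: a "model" that always answers the same way, and a naive
# string-inequality check standing in for a real contradiction judge.
flagged = detect_by_self_contradiction(
    prompt="Where is Paris?",
    original="Paris is in France.",
    sample=lambda p: "Paris is in Germany.",
    contradicts=lambda a, b: a != b,
)
```

Passing the sampler and the judge in as callables keeps the sketch independent of any particular model API.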

Abstract

Large language models (LLMs) have gained broad applications across various domains but still struggle with hallucinations. Currently, hallucinations occur frequently in the generation of factual content and pose a great challenge to trustworthy LLMs. However, hallucination detection is hindered by the laborious and expensive manual annotation of hallucinatory content. Meanwhile, as different LLMs exhibit distinct types and rates of hallucination, the collection of hallucination datasets is inherently model-specific, which also increases the cost. To address this issue, this paper proposes a method called AutoHall for $\underline{\text{Auto}}$matically constructing model-specific $\underline{\text{Hall}}$ucination datasets based on existing fact-checking datasets. The empirical results reveal variations in hallucination proportions and types among different models. Moreover, we introduce a…
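The abstract says only that the datasets are built from existing fact-checking datasets and does not give the pipeline. One way to picture how labeled claims could yield hallucination labels without manual annotation is sketched below. It is an assumption, not the authors' procedure: the model writes a reference for each claim, then judges the claim from that reference, and a judgment that disagrees with the ground-truth label marks the reference as hallucinated. `Example`, `build_dataset`, `generate_reference`, and `classify_claim` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class Example:
    claim: str
    label: bool         # ground-truth label from the fact-checking dataset
    reference: str      # text the model generated about the claim
    hallucinated: bool  # automatic label (assumed rule, see build_dataset)


def build_dataset(
    claims: Iterable[Tuple[str, bool]],
    generate_reference: Callable[[str], str],
    classify_claim: Callable[[str, str], bool],
) -> List[Example]:
    """Label model-generated references using ground-truth claim labels.

    Assumed rule: if the model's verdict on the claim, given its own
    reference, disagrees with the ground truth, flag the reference.
    """
    dataset = []
    for claim, label in claims:
        reference = generate_reference(claim)
        predicted = classify_claim(claim, reference)
        dataset.append(
            Example(claim, label, reference, hallucinated=(predicted != label))
        )
    return dataset
```

Because both callables wrap the same model, the resulting labels are specific to that model, which matches the abstract's point that hallucination datasets are inherently model-specific.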

Taxonomy

Topics: Topic Modeling · Sentiment Analysis and Opinion Mining · Misinformation and Its Impacts