ANAH: Analytical Annotation of Hallucinations in Large Language Models
Ziwei Ji, Yuzhe Gu, Wenwei Zhang, Chengqi Lyu, Dahua Lin, Kai Chen

TL;DR
ANAH is a bilingual dataset with detailed annotations of hallucinations in LLM-generated answers, enabling better measurement, training, and evaluation of hallucination detection and correction methods.
Contribution
The paper introduces ANAH, a comprehensive dataset with fine-grained hallucination annotations for LLM answers, and demonstrates its effectiveness in training and evaluating hallucination annotators.
Findings
Generative annotators trained on ANAH outperform open-source LLMs.
ANAH enables training models that approach GPT-4's performance.
Fine-grained annotations make it possible to quantify how hallucinations accumulate in LLM answers.
Abstract
Reducing the 'hallucination' problem of Large Language Models (LLMs) is crucial for their wide applications. A comprehensive and fine-grained measurement of the hallucination is the first key step for the governance of this issue but is under-explored in the community. Thus, we present ANAH, a bilingual dataset that offers ANalytical Annotation of Hallucinations in LLMs within Generative Question Answering. Each answer sentence in our dataset undergoes rigorous annotation, involving the retrieval of a reference fragment, the judgment of the hallucination type, and the correction of hallucinated content. ANAH consists of ~12k sentence-level annotations for ~4.3k LLM responses covering over 700 topics, constructed by a human-in-the-loop pipeline. Thanks to the fine granularity of the hallucination annotations, we can quantitatively…
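The sentence-level annotation described in the abstract (reference fragment, hallucination type, correction) can be pictured as a small record per answer sentence. The sketch below is illustrative only: the class name, field names, the label strings "none" and "contradictory", and the example sentences are all assumptions made for this sketch, not the dataset's actual schema or contents.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SentenceAnnotation:
    """One annotated answer sentence (hypothetical schema, not ANAH's own)."""
    sentence: str                      # sentence taken from the LLM answer
    reference_fragment: Optional[str]  # retrieved supporting text, if any
    hallucination_type: str            # label string; values here are placeholders
    correction: Optional[str] = None   # corrected sentence when hallucinated


def hallucinated_fraction(annotations: List[SentenceAnnotation],
                          clean_label: str = "none") -> float:
    """Share of sentences whose label differs from the assumed clean label."""
    if not annotations:
        return 0.0
    flagged = sum(1 for a in annotations if a.hallucination_type != clean_label)
    return flagged / len(annotations)


# Made-up example response with two sentences, one flagged.
example = [
    SentenceAnnotation(
        sentence="The river flows north.",
        reference_fragment="The river flows north through the valley.",
        hallucination_type="none",
    ),
    SentenceAnnotation(
        sentence="It is 900 km long.",
        reference_fragment="The river is 300 km long.",
        hallucination_type="contradictory",
        correction="It is 300 km long.",
    ),
]

print(hallucinated_fraction(example))
```

Keeping one record per sentence, rather than one label per response, is what lets a per-response statistic like the fraction above be computed at all.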
Taxonomy
Topics: Machine Learning in Healthcare · Anomaly Detection Techniques and Applications
Methods: Attention Is All You Need · Label Smoothing · Adam · Position-Wise Feed-Forward Layer · Dropout · Dense Connections · Absolute Position Encodings · Softmax
