ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models

Yuzhe Gu; Ziwei Ji; Wenwei Zhang; Chengqi Lyu; Dahua Lin; Kai Chen

arXiv:2407.04693 · cs.CL · December 20, 2024 · 1 citation

TL;DR

This paper presents an iterative self-training framework for scalable and accurate hallucination annotation in large language models, significantly improving detection and mitigation of hallucinations across domains.

Contribution

It introduces a novel EM-based self-training approach that scales hallucination annotation datasets and enhances annotator accuracy, surpassing GPT-4 in detection performance.

Findings

01. The hallucination annotator outperforms GPT-4 in detection accuracy.

02. The framework achieves state-of-the-art results on HaluEval and HalluQA.

03. The annotator also helps mitigate LLM hallucinations, raising the NLI metric from 25% to 37%.

Abstract

Large language models (LLMs) exhibit hallucinations in long-form question-answering tasks across various domains and wide applications. Current hallucination detection and mitigation datasets are limited in domains and sizes, which struggle to scale due to prohibitive labor costs and insufficient reliability of existing hallucination annotators. To facilitate the scalable oversight of LLM hallucinations, this paper introduces an iterative self-training framework that simultaneously and progressively scales up the hallucination annotation dataset and improves the accuracy of the hallucination annotator. Based on the Expectation Maximization (EM) algorithm, in each iteration, the framework first applies a hallucination annotation pipeline to annotate a scaled dataset and then trains a more accurate hallucination annotator on the dataset. This new hallucination annotator is adopted in the…
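The iteration described in the abstract (annotate a larger dataset with the current annotator, then train a new annotator on the result) can be sketched as a toy self-training loop. This is a minimal illustration and not the authors' implementation: the real annotator is an LLM, whereas here a one-dimensional threshold classifier stands in for it, and the names `ThresholdAnnotator`, `train_annotator`, and `self_train` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Toy example: (feature, label), where label 1 means "hallucinated".
# The single float feature is an assumption made purely for illustration.
Example = Tuple[float, int]


@dataclass
class ThresholdAnnotator:
    """Hypothetical stand-in for the hallucination annotator."""
    threshold: float

    def annotate(self, x: float) -> int:
        # Label an item as hallucinated when its feature reaches the threshold.
        return 1 if x >= self.threshold else 0


def train_annotator(labeled: List[Example]) -> ThresholdAnnotator:
    """Training step: place the threshold midway between the two class means."""
    pos = [x for x, y in labeled if y == 1]
    neg = [x for x, y in labeled if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    return ThresholdAnnotator((sum(pos) / len(pos) + sum(neg) / len(neg)) / 2)


def self_train(seed: List[Example], pools: List[List[float]]):
    """Alternate annotation and training over progressively larger pools."""
    annotator = train_annotator(seed)
    dataset = list(seed)
    for pool in pools:
        # Annotation step: label the new, unlabeled pool with the current annotator.
        dataset += [(x, annotator.annotate(x)) for x in pool]
        # Training step: fit a new annotator on the enlarged dataset.
        annotator = train_annotator(dataset)
    return annotator, dataset
```

Each pass through the loop both grows the labeled dataset and replaces the annotator, which mirrors the "simultaneously and progressively" wording of the abstract. How the actual pipeline filters or aggregates annotations is not covered by this sketch.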

Code & Models

Repositories

open-compass/anah
PyTorch · Official

Models

🤗 opencompass/anah-v2
model · 24 dl · ♡ 4

Datasets

opencompass/anah
dataset · 164 dl

Taxonomy

Topics: Mental Health via Writing · Machine Learning in Healthcare

Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Softmax · Residual Connection · Byte Pair Encoding · Layer Normalization · Label Smoothing · Adam · Dropout