HalluEntity: Benchmarking and Understanding Entity-Level Hallucination Detection

Min-Hsuan Yeh; Max Kamachee; Seongheon Park; Yixuan Li

arXiv:2502.11948·cs.CL·September 5, 2025

TL;DR

This paper introduces HalluEntity, a new dataset for entity-level hallucination detection in LLMs, and evaluates existing uncertainty-based methods, revealing their limitations and guiding future research directions.

Contribution

The paper presents the first entity-level hallucination detection dataset and a comprehensive evaluation of uncertainty-based detection methods across multiple LLMs.

Findings

1. Token probability-based methods over-predict hallucinations.

2. Context-aware methods perform better but are still suboptimal.

3. Linguistic properties influence hallucination tendencies.

Abstract

To mitigate the impact of LLMs' tendency to hallucinate, many studies propose detecting hallucinated generations through uncertainty estimation. However, these approaches predominantly operate at the sentence or paragraph level and fail to pinpoint the specific spans or entities responsible for hallucinated content. This lack of granularity is especially problematic for long-form outputs that mix accurate and fabricated information. To address this limitation, we explore entity-level hallucination detection. We propose a new dataset, HalluEntity, which annotates hallucination at the entity level. Based on the dataset, we comprehensively evaluate uncertainty-based hallucination detection approaches across 17 modern LLMs. Our experimental results show that uncertainty estimation approaches focusing on individual token probabilities tend to over-predict hallucinations, while context-aware…
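
To make the idea of token-probability scoring at the entity level concrete, here is a minimal sketch. It is not the method evaluated in the paper: the function names, the mean/max aggregation options, and the threshold are all hypothetical choices for illustration. It assumes you already have a per-token probability from the model and the token index range of each entity.

```python
import math
from typing import List, Tuple


def token_nll(token_probs: List[float]) -> List[float]:
    """Negative log-likelihood of each generated token (higher = more uncertain)."""
    return [-math.log(p) for p in token_probs]


def entity_scores(
    token_probs: List[float],
    entity_spans: List[Tuple[int, int]],
    agg: str = "mean",
) -> List[float]:
    """Aggregate token-level NLL over each entity span.

    entity_spans holds (start, end) token indices, end exclusive.
    The aggregation rule is an assumption of this sketch.
    """
    nll = token_nll(token_probs)
    scores = []
    for start, end in entity_spans:
        span = nll[start:end]
        if not span:
            raise ValueError(f"empty entity span: ({start}, {end})")
        scores.append(max(span) if agg == "max" else sum(span) / len(span))
    return scores


def flag_entities(scores: List[float], threshold: float) -> List[bool]:
    """Mark an entity as a suspected hallucination when its score exceeds the threshold."""
    return [s > threshold for s in scores]


# Hypothetical example: five tokens, two entities covering tokens 0-1 and 2-3.
probs = [0.9, 0.8, 0.2, 0.1, 0.95]
spans = [(0, 2), (2, 4)]
scores = entity_scores(probs, spans)
flags = flag_entities(scores, threshold=1.0)
```

With these made-up probabilities, the second entity is flagged and the first is not. A scorer of this kind looks only at individual token probabilities and ignores surrounding context, which is the family of approaches the abstract reports as tending to over-predict hallucinations.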

Code & Models

Datasets

samuelyeh/HalluEntity (dataset, 20 dl)
