The HalluRAG Dataset: Detecting Closed-Domain Hallucinations in RAG Applications Using an LLM's Internal States
Fabian Ridder, Malte Schilling

TL;DR
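This paper introduces HalluRAG, a dataset for sentence-level detection of closed-domain hallucinations in large language models from their internal states. It targets hallucinations about information the model never saw in training, using recency to ensure that information emerged after the training cut-off date. Classifiers trained on it reach up to 75% accuracy.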
Contribution
The study presents HalluRAG, a novel dataset for training hallucination classifiers based on internal LLM states, and evaluates their effectiveness across different models and configurations.
Findings
Classifiers trained on HalluRAG reach up to 75% accuracy.
Internal states encode distinguishable signals for hallucinated vs. factual sentences.
Answerable and unanswerable prompts are encoded differently, improving detection.
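To make the idea of classifying sentences from internal states concrete, here is a minimal sketch of a probe trained on per-sentence feature vectors. Everything in it is an assumption for illustration: the vectors are synthetic stand-ins rather than real LLM activations, the helper names (make_synthetic_states, train_probe) are hypothetical, and a plain logistic-regression probe is swapped in because this summary does not specify the classifier the authors used.

```python
import math
import random


def make_synthetic_states(n, dim, rng):
    """Build (vector, label) pairs; label 1 = hallucinated, 0 = grounded.

    ASSUMPTION: these Gaussian vectors only imitate per-sentence internal
    states. Real features would be read out of an LLM's hidden layers.
    """
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        shift = 1.0 if label == 1 else -1.0
        vec = [rng.gauss(shift, 1.0) for _ in range(dim)]
        data.append((vec, label))
    return data


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(vec, w, b):
    # Probability that the sentence behind `vec` is hallucinated.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, vec)) + b)


def train_probe(data, dim, epochs=20, lr=0.05):
    """Fit a logistic-regression probe with plain stochastic gradient descent."""
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for vec, label in data:
            err = predict_proba(vec, w, b) - label  # gradient of log loss w.r.t. the logit
            w = [wi - lr * err * xi for wi, xi in zip(w, vec)]
            b -= lr * err
    return w, b


def accuracy(data, w, b):
    correct = sum(
        int((predict_proba(vec, w, b) >= 0.5) == bool(label))
        for vec, label in data
    )
    return correct / len(data)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
train = make_synthetic_states(400, 8, rng)
test = make_synthetic_states(200, 8, rng)
w, b = train_probe(train, dim=8)
test_accuracy = accuracy(test, w, b)
print(f"held-out accuracy on synthetic data: {test_accuracy:.2f}")
```

The synthetic classes are deliberately well separated, so the accuracy printed here says nothing about the 75% figure reported for HalluRAG; the sketch only shows the shape of the pipeline (feature vector per sentence, binary label, held-out evaluation).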
Abstract
Detecting hallucinations in large language models (LLMs) is critical for enhancing their reliability and trustworthiness. Most research focuses on hallucinations as deviations from information seen during training. However, the opaque nature of an LLM's parametric knowledge complicates the understanding of why generated texts appear ungrounded: The LLM might not have picked up the necessary knowledge from large and often inaccessible datasets, or the information might have been changed or contradicted during further training. Our focus is on hallucinations involving information not used in training, which we determine by using recency to ensure the information emerged after a cut-off date. This study investigates these hallucinations by detecting them at sentence level using different internal states of various LLMs. We present HalluRAG, a dataset designed to train classifiers on these…
