Halu-J: Critique-Based Hallucination Judge
Binjie Wang, Steffi Chern, Ethan Chern, Pengfei Liu

TL;DR
Halu-J is a 7-billion-parameter critique-based model for detecting hallucinations in large language model output. It selects the relevant evidence and writes a detailed critique, and it outperforms GPT-4o on multiple-evidence hallucination detection.
Contribution
The paper introduces Halu-J, a novel critique-based hallucination detection model that enhances evidence relevance assessment and critique generation, addressing limitations of previous retrieval-based approaches.
Findings
Halu-J outperforms GPT-4o in multiple-evidence hallucination detection.
Halu-J matches GPT-4o in critique generation and evidence selection.
The paper introduces ME-FEVER, a dataset for multiple-evidence hallucination detection.
Abstract
Large language models (LLMs) frequently generate non-factual content, known as hallucinations. Existing retrieval-augmented hallucination detection approaches typically frame detection as a classification task, judging content by its consistency with retrieved evidence. However, these approaches usually lack detailed explanations for their judgments and do not assess the reliability of those explanations. Furthermore, deficiencies in retrieval systems can return irrelevant or only partially relevant evidence, impairing the detection process. Moreover, while real-world hallucination detection requires analyzing multiple pieces of evidence, current systems usually treat all evidence uniformly without considering its relevance to the content. To address these challenges, we introduce Halu-J, a critique-based hallucination judge with 7 billion…
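To make the multiple-evidence setting concrete, here is a minimal Python sketch of "filter evidence by relevance, then judge". It is not Halu-J's method: the token-overlap `relevance` score, the `threshold` value, and the `critic` callable are hypothetical stand-ins, chosen only to contrast relevance-aware selection with treating all retrieved evidence uniformly.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Evidence:
    text: str


def relevance(claim: str, evidence: Evidence) -> float:
    """Toy score (an assumption, not from the paper): the fraction of
    claim tokens that also appear in the evidence text."""
    claim_tokens = set(claim.lower().split())
    evidence_tokens = set(evidence.text.lower().split())
    if not claim_tokens:
        return 0.0
    return len(claim_tokens & evidence_tokens) / len(claim_tokens)


def select_relevant(
    claim: str, evidence_list: List[Evidence], threshold: float = 0.5
) -> List[Evidence]:
    """Keep only evidence whose toy relevance score reaches the threshold."""
    return [e for e in evidence_list if relevance(claim, e) >= threshold]


def judge_claim(
    claim: str,
    evidence_list: List[Evidence],
    critic: Callable[[str, List[Evidence]], str],
    threshold: float = 0.5,
) -> dict:
    """Select relevant evidence first, then hand only that subset to a
    critic. `critic` is a hypothetical placeholder for whatever model
    produces the critique."""
    selected = select_relevant(claim, evidence_list, threshold)
    return {"selected": selected, "critique": critic(claim, selected)}


if __name__ == "__main__":
    claim = "Paris is the capital of France"
    retrieved = [
        Evidence("Paris is the capital and largest city of France"),
        Evidence("Bananas are rich in potassium"),
    ]
    result = judge_claim(
        claim,
        retrieved,
        critic=lambda c, ev: f"Judged against {len(ev)} relevant passage(s).",
    )
    print(result["critique"])
```

In this sketch the unrelated passage is dropped before the critic sees it. A real system would replace both the overlap score and the critic with learned models.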
Taxonomy
Topics
Mental Health and Psychiatry
