Small Agent Can Also Rock! Empowering Small Language Models as Hallucination Detector
Xiaoxue Cheng, Junyi Li, Wayne Xin Zhao, Hongzhi Zhang, Fuzheng Zhang, Di Zhang, Kun Gai, Ji-Rong Wen

TL;DR
This paper introduces HaluAgent, a framework enabling smaller language models to detect hallucinations across various data types effectively, matching or surpassing GPT-4's performance with minimal tuning.
Contribution
HaluAgent is a novel autonomous agent framework that allows small LLMs to detect hallucinations using a three-stage detection process and a fine-tuning strategy, reducing reliance on large models.
Findings
HaluAgent achieves comparable or better performance than GPT-4 in hallucination detection.
Only 2K samples are needed for effective fine-tuning of small LLMs.
HaluAgent works effectively across multiple languages and data types.
Abstract
Hallucination detection is a challenging task for large language models (LLMs), and existing studies rely heavily on powerful closed-source LLMs such as GPT-4. In this paper, we propose an autonomous LLM-based agent framework, called HaluAgent, which enables relatively smaller LLMs (e.g., Baichuan2-Chat 7B) to actively select suitable tools for detecting multiple hallucination types in text, code, and mathematical expressions. In HaluAgent, we integrate the LLM with a multi-functional toolbox and design a fine-grained three-stage detection framework along with a memory mechanism. To improve the effectiveness of HaluAgent, we leverage existing Chinese and English datasets to synthesize detection trajectories for fine-tuning, which endows HaluAgent with the capability for bilingual hallucination detection. Extensive experiments demonstrate that only using 2K samples for tuning LLMs,…
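The agent loop described in the abstract (a toolbox, a staged detection pass, and a memory of intermediate results) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the stage names (segment, select a tool and verify, reflect) are not given in the text above, the rule-based `select_tool` stands in for the fine-tuned LLM's tool choice, and the three toy checkers and the `KNOWN_FACTS` set stand in for real verification tools such as search or a code interpreter.

```python
import re
from typing import Callable, Dict, List

# Hypothetical stand-in for an external knowledge source (e.g. web search).
KNOWN_FACTS = {"paris is the capital of france."}

MATH_PATTERN = re.compile(r"(-?\d+)\s*([+\-*])\s*(-?\d+)\s*=\s*(-?\d+)")


def check_text(sentence: str) -> bool:
    """Toy text checker: supported only if the sentence is a known fact."""
    return sentence.lower() in KNOWN_FACTS


def check_math(sentence: str) -> bool:
    """Toy math checker: recompute 'a op b = c' and compare with the claim."""
    m = MATH_PATTERN.search(sentence)
    if m is None:
        return True  # nothing to verify
    a, op, b, claimed = int(m[1]), m[2], int(m[3]), int(m[4])
    actual = {"+": a + b, "-": a - b, "*": a * b}[op]
    return actual == claimed


def check_code(sentence: str) -> bool:
    """Toy code checker: the backtick-quoted snippet must at least compile."""
    m = re.search(r"`([^`]+)`", sentence)
    if m is None:
        return True
    try:
        compile(m.group(1), "<snippet>", "exec")
        return True
    except SyntaxError:
        return False


# The "multi-functional toolbox": tool name -> verification function.
TOOLBOX: Dict[str, Callable[[str], bool]] = {
    "text": check_text,
    "math": check_math,
    "code": check_code,
}


def select_tool(sentence: str) -> str:
    """Rule-based placeholder for the LLM's tool selection."""
    if "`" in sentence:
        return "code"
    if MATH_PATTERN.search(sentence):
        return "math"
    return "text"


def detect(response: str) -> dict:
    """Run the three assumed stages and return a verdict plus the memory."""
    memory: List[dict] = []  # stores per-sentence evidence across steps

    # Stage 1 (assumed): split the response into sentence-level claims.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", response.strip()) if s]

    # Stage 2 (assumed): pick a tool for each claim and verify it.
    for sentence in sentences:
        tool = select_tool(sentence)
        supported = TOOLBOX[tool](sentence)
        memory.append({"sentence": sentence, "tool": tool, "supported": supported})

    # Stage 3 (assumed): reflect over the memory to reach a final verdict.
    hallucinated = any(not record["supported"] for record in memory)
    return {"hallucinated": hallucinated, "memory": memory}


if __name__ == "__main__":
    report = detect("Paris is the capital of France. We know that 2 + 2 = 5.")
    print(report["hallucinated"])
    for record in report["memory"]:
        print(record)
```

In this sketch the memory is a plain list of per-sentence records, so the final verdict can point back to the exact claim and tool that produced a failure. In the framework described above, the tool choice would come from the tuned model rather than from fixed rules.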
Taxonomy
Topics: Machine Learning in Healthcare
Methods: Residual Connection · Softmax · Layer Normalization · Byte Pair Encoding · Label Smoothing · Adam · Attention Is All You Need · Linear Layer · Multi-Head Attention · Position-Wise Feed-Forward Layer
