Towards Mitigation of Hallucination for LLM-empowered Agents: Progressive Generalization Bound Exploration and Watchdog Monitor
Siyuan Liu, Wenjing Liu, Zhiwei Xu, Xin Wang, Bo Chen, Tao Li

TL;DR
This paper introduces HalMit, a black-box framework that detects hallucinations in LLM-powered agents by modeling their generalization bounds, significantly improving reliability without needing internal model details.
Contribution
The paper proposes a novel black-box watchdog framework, HalMit, utilizing probabilistic fractal sampling to effectively detect hallucinations in LLM-empowered agents.
Findings
HalMit outperforms existing hallucination detection methods.
It effectively models the generalization bounds of LLM agents.
The approach does not require access to internal LLM architecture.
Abstract
Empowered by large language models (LLMs), intelligent agents have become a popular paradigm for interacting with open environments to facilitate AI deployment. However, hallucinations generated by LLMs, where outputs are inconsistent with facts, pose a significant challenge, undermining the credibility of intelligent agents. Only if hallucinations can be mitigated can intelligent agents be used in the real world without catastrophic risk. Therefore, effective detection and mitigation of hallucinations are crucial to ensure the dependability of agents. Unfortunately, related approaches either depend on white-box access to LLMs or fail to accurately identify hallucinations. To address the challenge posed by hallucinations of intelligent agents, we present HalMit, a novel black-box watchdog framework that models the generalization bound of LLM-empowered agents and thus detect…
Taxonomy
Topics: Big Data and Digital Economy · Computability, Logic, AI Algorithms · Cell Image Analysis Techniques
