HalluZig: Hallucination Detection using Zigzag Persistence
Shreyas N. Samaga, Gilberto Gonzalez Arroyo, Tamal K. Dey

TL;DR
HalluZig uses zigzag persistence, a tool from topological data analysis, to detect hallucinations in LLMs from the layer-wise evolution of their internal attention, and it outperforms strong baselines on multiple benchmarks.
Contribution
This paper presents a novel hallucination detection method based on topological signatures of attention evolution, a new paradigm in LLM reliability assessment.
Findings
HalluZig outperforms strong baselines on multiple benchmarks.
Topological signatures are consistent across different models.
Structural signatures from partial network depth suffice for detection.
Abstract
The factual reliability of Large Language Models (LLMs) remains a critical barrier to their adoption in high-stakes domains due to their propensity to hallucinate. Current detection methods often rely on surface-level signals from the model's output, overlooking the failures that occur within the model's internal reasoning process. In this paper, we introduce a new paradigm for hallucination detection that analyzes how the topology of the model's layer-wise attention evolves. We model the sequence of attention matrices as a zigzag graph filtration and use zigzag persistence, a tool from Topological Data Analysis, to extract a topological signature. Our core hypothesis is that factual and hallucinated generations exhibit distinct topological signatures. We validate our framework, HalluZig, on multiple benchmarks, demonstrating that it outperforms strong baselines. Furthermore,…
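To make the idea of a "zigzag graph filtration" concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' pipeline: the thresholding rule, the parameter `tau`, the union-based interleaving G_1 ⊆ G_1 ∪ G_2 ⊇ G_2 ⊆ ..., and all function names are hypothetical choices that the abstract does not specify. The sketch also reports only the Betti numbers of each graph in the sequence, which is a much cruder summary than a zigzag persistence barcode; computing actual zigzag persistence requires a dedicated TDA library.

```python
def threshold_graph(attn, tau):
    """Undirected edge set {(i, j), i < j} where attention in either direction is >= tau.

    Assumption: the graph is built by thresholding; the paper may use another rule.
    """
    n = len(attn)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            if max(attn[i][j], attn[j][i]) >= tau:
                edges.add((i, j))
    return edges


def zigzag_sequence(attn_layers, tau):
    """Interleave per-layer graphs with unions: G1, G1|G2, G2, G2|G3, G3, ...

    Each union contains both of its neighbours, so the inclusion maps
    alternate in direction, which is what makes the sequence a zigzag.
    """
    graphs = [threshold_graph(a, tau) for a in attn_layers]
    seq = [graphs[0]]
    for prev, nxt in zip(graphs, graphs[1:]):
        seq.append(prev | nxt)
        seq.append(nxt)
    return seq


def betti_numbers(n_vertices, edges):
    """Return (beta0, beta1) of a graph: components and independent cycles."""
    parent = list(range(n_vertices))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = n_vertices
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    # For a graph, beta1 = |E| - |V| + beta0.
    return components, len(edges) - n_vertices + components


# Toy attention matrices for 3 tokens at 2 layers (made-up numbers).
layers = [
    [[1.0, 0.0, 0.0], [0.6, 0.4, 0.0], [0.1, 0.2, 0.7]],
    [[1.0, 0.0, 0.0], [0.1, 0.9, 0.0], [0.5, 0.5, 0.0]],
]
profile = [betti_numbers(3, g) for g in zigzag_sequence(layers, tau=0.5)]
```

In this toy case the intermediate union graph contains a cycle that neither layer's graph has on its own. Tracking when such features appear and disappear along the sequence is the kind of information a zigzag barcode records, whereas the per-graph Betti numbers here only count features at each step.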
Taxonomy
Topics: Topological and Geometric Data Analysis · Advanced Graph Neural Networks · Explainable Artificial Intelligence (XAI)
