LANCET: Neural Intervention via Structural Entropy for Mitigating Faithfulness Hallucinations in LLMs
Chenxu Wang, Chaozhuo Li, Pengbo Wang, Litian Zhang, Songyang Liu, Ji Qi, Jiahui Hu, Yushan Cai, Hao Zhao, Rui Pu

TL;DR
Lancet introduces a structural entropy-based neural intervention framework to precisely identify and block pathways causing hallucinations in large language models, significantly improving faithfulness without impairing overall performance.
Contribution
The paper presents Lancet, a novel method that leverages structural entropy and gradient analysis for targeted neural intervention to reduce hallucinations in LLMs.
Findings
Lancet outperforms existing methods on hallucination benchmarks.
The intervention preserves the general capabilities of the model.
Structural entropy effectively identifies propagation pathways of hallucinations.
Abstract
Large Language Models have revolutionized information processing, yet their reliability is severely compromised by faithfulness hallucinations. While current approaches attempt to mitigate this issue through node-level adjustments or coarse suppression, they often overlook the distributed nature of neural information, leading to imprecise interventions. Recognizing that hallucinations propagate through specific forward transmission pathways like an infection, we aim to surgically block this flow using precise structural analysis. To this end, we propose Lancet, a novel framework that achieves precise neural intervention using structural entropy and hallucination difference ratios. Lancet first locates hallucination-prone neurons via gradient-driven contrastive analysis, then maps their propagation pathways by minimizing structural entropy, and finally implements a…
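The abstract names structural entropy minimization as the pathway-mapping step but does not give a formula here. As a rough illustration only, the sketch below computes the standard two-dimensional structural entropy of a graph partition from the structural information theory literature; whether Lancet uses exactly this form is an assumption. The function name `structural_entropy_2d`, the toy graph, and the two partitions are hypothetical and chosen for illustration. A lower value means the partition captures more of the graph's structure, which is the sense in which a minimizing partition could group neurons into pathways.

```python
import math
from collections import defaultdict


def structural_entropy_2d(edges, partition):
    """Two-dimensional structural entropy of an undirected weighted graph.

    edges:     iterable of (u, v, weight) tuples
    partition: dict mapping each node to a module id
    (Hypothetical helper; not taken from the Lancet paper.)
    """
    degree = defaultdict(float)   # weighted degree of each node
    volume = defaultdict(float)   # sum of degrees inside each module
    cut = defaultdict(float)      # weight of edges leaving each module

    for u, v, w in edges:
        degree[u] += w
        degree[v] += w
        if partition[u] != partition[v]:
            cut[partition[u]] += w
            cut[partition[v]] += w

    total = sum(degree.values())  # equals twice the total edge weight
    for node, d in degree.items():
        volume[partition[node]] += d

    entropy = 0.0
    # Uncertainty of locating a node inside its own module.
    for node, d in degree.items():
        if d > 0:
            entropy -= (d / total) * math.log2(d / volume[partition[node]])
    # Uncertainty of a random walk entering each module from outside.
    for module, vol in volume.items():
        if cut[module] > 0:
            entropy -= (cut[module] / total) * math.log2(vol / total)
    return entropy


# Toy graph: two triangles joined by a single bridge edge (2, 3).
toy_edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
             (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
             (2, 3, 1.0)]
by_triangle = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
scrambled = {0: "a", 3: "a", 1: "b", 4: "b", 2: "c", 5: "c"}

print(structural_entropy_2d(toy_edges, by_triangle))
print(structural_entropy_2d(toy_edges, scrambled))
```

On this toy graph the partition that follows the two triangles scores lower than the scrambled one, so a search that minimizes this quantity would prefer the grouping that matches the graph's dense regions.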
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Ferroelectric and Negative Capacitance Devices · Domain Adaptation and Few-Shot Learning
