Do Hallucination Neurons Generalize? Evidence from Cross-Domain Transfer in LLMs

Snehit Vaddi; Pujith Vaddi

arXiv:2604.19765·cs.CL·April 23, 2026

TL;DR

This study tests whether hallucination neurons (H-neurons) in large language models generalize across knowledge domains. It finds that they do not, which suggests that hallucination detectors built on these neurons may not carry over from one domain to another without retraining.

Contribution

The paper provides evidence that hallucination neurons are domain-specific, challenging the idea of a universal neural signature for hallucination in LLMs.

Findings

01

H-neurons identified in one knowledge domain do not generalize well to the other domains tested.

02

Classifiers trained on one domain's H-neurons reach AUROC 0.783 within-domain but only 0.563 when transferred to a different domain.

03

The results suggest that hallucination mechanisms are domain-dependent rather than universal.

Abstract

Recent work identifies a sparse set of "hallucination neurons" (H-neurons), less than 0.1% of feed-forward network neurons, that reliably predict when large language models will hallucinate. These neurons are identified on general-knowledge question answering and shown to generalize to new evaluation instances. We ask a natural follow-up question: do H-neurons generalize across knowledge domains? Using a systematic cross-domain transfer protocol across 6 domains (general QA, legal, financial, science, moral reasoning, and code vulnerability) and 5 open-weight models (3B to 8B parameters), we find they do not. Classifiers trained on one domain's H-neurons achieve AUROC 0.783 within-domain but only 0.563 when transferred to a different domain (delta = 0.220, p < 0.001), a degradation consistent across all models tested. Our results suggest that hallucination is not a single mechanism with…
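The cross-domain transfer protocol described in the abstract can be illustrated with a minimal sketch: fit a classifier on each source domain, score every target domain's held-out examples, and compare the mean within-domain AUROC (source equals target) against the mean cross-domain AUROC. Everything below is an assumption made for illustration. The function names, the single-neuron "classifier", and the two tiny synthetic domains are hypothetical stand-ins, not the authors' code, classifier, or data.

```python
from itertools import product
from statistics import mean


def auroc(scores, labels):
    """AUROC as the probability that a random positive example
    outscores a random negative one (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one example of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p, n in product(pos, neg))
    return wins / (len(pos) * len(neg))


def fit_top_neuron(X, y):
    """Hypothetical stand-in classifier: keep the single neuron whose mean
    activation differs most between hallucinated (1) and correct (0) answers."""
    def diff(j):
        pos = mean(x[j] for x, lab in zip(X, y) if lab == 1)
        neg = mean(x[j] for x, lab in zip(X, y) if lab == 0)
        return pos - neg
    j = max(range(len(X[0])), key=lambda k: abs(diff(k)))
    return j, (1.0 if diff(j) >= 0 else -1.0)


def score_top_neuron(model, x):
    j, sign = model
    return sign * x[j]


def transfer_matrix(domains, fit, score):
    """domains maps name -> (train_X, train_y, test_X, test_y).
    Returns {(source, target): AUROC} for every ordered pair."""
    result = {}
    for src, (train_X, train_y, _, _) in domains.items():
        model = fit(train_X, train_y)
        for tgt, (_, _, test_X, test_y) in domains.items():
            scores = [score(model, x) for x in test_X]
            result[(src, tgt)] = auroc(scores, test_y)
    return result


def transfer_gap(matrix):
    """Mean within-domain AUROC, mean cross-domain AUROC, and their difference."""
    within = mean(v for (s, t), v in matrix.items() if s == t)
    cross = mean(v for (s, t), v in matrix.items() if s != t)
    return within, cross, within - cross


# Synthetic toy data (NOT from the paper): in domain "A" neuron 0 tracks
# the label, in domain "B" neuron 1 does.
toy_domains = {
    "A": ([[0.9, 0.2], [0.8, 0.7], [0.1, 0.6], [0.2, 0.1]], [1, 1, 0, 0],
          [[0.7, 0.5], [0.8, 0.3], [0.3, 0.3], [0.2, 0.5]], [1, 1, 0, 0]),
    "B": ([[0.5, 0.9], [0.4, 0.8], [0.6, 0.1], [0.5, 0.2]], [1, 1, 0, 0],
          [[0.2, 0.7], [0.6, 0.8], [0.6, 0.3], [0.2, 0.2]], [1, 1, 0, 0]),
}

matrix = transfer_matrix(toy_domains, fit_top_neuron, score_top_neuron)
within, cross, gap = transfer_gap(matrix)
print(within, cross, gap)
```

On this toy data the selected neuron separates the classes perfectly in its own domain and falls to chance in the other, which mirrors the shape (though not the magnitude) of the reported gap. The same bookkeeping applied to the figures in the abstract gives 0.783 - 0.563 = 0.220.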
