When Answers Stray from Questions: Hallucination Detection via Question-Answer Orthogonal Decomposition

Siyang Yao; Erhu Feng; Yubin Xia

arXiv:2605.14449·cs.LG·May 15, 2026

TL;DR

This paper introduces QAOD, a single-pass method for hallucination detection in large language models that improves accuracy and robustness by orthogonalizing answer representations relative to questions.

Contribution

QAOD is a novel framework that projects answer representations away from question-aligned directions, enhancing hallucination detection and domain transfer in LLMs.
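
The projection step can be illustrated with a short sketch. It is a minimal example, not the authors' implementation. It assumes the question and the answer have each already been pooled into a single hidden-state vector and that one question-aligned direction is removed. The summary above specifies neither detail, and the names `dot` and `orthogonal_component` are hypothetical.

```python
from typing import List


def dot(u: List[float], v: List[float]) -> float:
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def orthogonal_component(answer: List[float], question: List[float],
                         eps: float = 1e-12) -> List[float]:
    """Return the part of `answer` that is orthogonal to `question`.

    Assumption: both inputs are single pooled representation vectors.
    """
    q_norm_sq = dot(question, question)
    if q_norm_sq < eps:
        # A (near-)zero question vector defines no direction to remove.
        return list(answer)
    scale = dot(answer, question) / q_norm_sq
    # Subtract the projection of the answer onto the question direction.
    return [a - scale * q for a, q in zip(answer, question)]


q = [1.0, 0.0, 0.0]
a = [2.0, 3.0, -1.0]
a_perp = orthogonal_component(a, q)
print(a_perp)  # [0.0, 3.0, -1.0]
```

The result has zero inner product with `q`, so whatever the answer vector shares with the question direction has been removed before any probe sees it.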

Findings

01

QAOD achieves the best in-domain AUROC across multiple datasets.

02

Orthogonal-only probe surpasses white-box baselines in out-of-domain transfer.

03

QAOD reduces detection cost to under 25% of generation cost.

Abstract

Hallucination detection in large language models (LLMs) requires balancing accuracy, efficiency, and robustness to distribution shift. Black-box consistency methods are effective but demand repeated inference; single-pass white-box probes are efficient yet treat answer representations in isolation, often degrading sharply under domain shift. We propose QAOD (Question-Answer Orthogonal Decomposition), a single-pass framework that projects away the question-aligned direction from the answer representation to obtain a question-orthogonal component that suppresses domain-conditioned variation. To identify informative signals, QAOD further selects layers via diversity-penalized Fisher scoring and discriminative neurons via Fisher importance. To address both in-domain detection and cross-domain generalization, we design two complementary probing strategies: pairing the orthogonal component…
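
As a rough illustration of neuron selection by Fisher importance, the sketch below ranks neurons by a standard two-class Fisher criterion: squared difference of class means divided by the sum of class variances. The abstract does not give the exact formula QAOD uses, so this criterion, the toy activations, and the names `fisher_scores` and `top_k_neurons` are all assumptions. The diversity penalty used for layer selection is not modeled.

```python
from statistics import fmean, pvariance
from typing import List


def fisher_scores(truthful: List[List[float]],
                  hallucinated: List[List[float]],
                  eps: float = 1e-8) -> List[float]:
    """Per-neuron Fisher criterion; rows are examples, columns are neurons.

    Assumption: the generic two-class Fisher score, not necessarily
    the exact importance measure used in the paper.
    """
    scores = []
    for j in range(len(truthful[0])):
        t = [row[j] for row in truthful]
        h = [row[j] for row in hallucinated]
        between = (fmean(t) - fmean(h)) ** 2   # class separation
        within = pvariance(t) + pvariance(h)   # within-class spread
        scores.append(between / (within + eps))
    return scores


def top_k_neurons(scores: List[float], k: int) -> List[int]:
    """Indices of the k highest-scoring neurons, best first."""
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


# Toy activations (hypothetical): only neuron 0 separates the two classes.
truthful = [[0.0, 1.0, 5.0], [0.2, 3.0, 5.1], [-0.2, 2.0, 4.9]]
hallucinated = [[2.0, 2.5, 5.0], [2.2, 1.5, 5.2], [1.8, 2.0, 4.8]]

scores = fisher_scores(truthful, hallucinated)
selected = top_k_neurons(scores, k=1)
print(selected)  # [0]
```

In this toy data, neurons 1 and 2 have nearly identical class means, so their scores are close to zero and only neuron 0 is kept.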
