Neural Uncertainty Principle: A Unified View of Adversarial Fragility and LLM Hallucination

Dong-Xiao Zhang; Hu Lou; Jun-Jie Zhang; Jun Zhu; Deyu Meng

arXiv:2603.19562 · cs.LG · March 30, 2026

TL;DR

This paper introduces the Neural Uncertainty Principle (NUP), which traces adversarial vulnerability in vision models and hallucination in language models to a shared geometric uncertainty bound.

Contribution

It formalizes a common geometric origin for adversarial fragility and hallucination, and proposes two practical methods, ConjMask and LogitReg, that improve robustness without adversarial training.

Findings

01

Masking highly coupled input components improves vision robustness.

02

A single-backward input-gradient probe detects hallucination risk before token generation in language models.

03

NUP provides a unified framework for diagnosing and mitigating boundary failures.

Abstract

Adversarial vulnerability in vision and hallucination in large language models are conventionally viewed as separate problems, each addressed with modality-specific patches. This study first reveals that they share a common geometric origin: the input and its loss gradient are conjugate observables subject to an irreducible uncertainty bound. Formalizing a Neural Uncertainty Principle (NUP) under a loss-induced state, we find that in near-bound regimes, further compression must be accompanied by increased sensitivity dispersion (adversarial fragility), while weak prompt-gradient coupling leaves generation under-constrained (hallucination). Crucially, this bound is modulated by an input-gradient correlation channel, captured by a specifically designed single-backward probe. In vision, masking highly coupled components improves robustness without costly adversarial training; in language,…
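To make the idea of an input-gradient correlation probe concrete, here is a minimal sketch in pure Python. It is an illustration under stated assumptions, not the paper's probe or its ConjMask method: the model is a toy logistic classifier, the gradient is written out in closed form rather than obtained from a backward pass, the per-component "coupling" is assumed to be |x_i * g_i|, and the names `input_gradient`, `coupling_scores`, `cosine`, and `mask_top_k` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def input_gradient(w, x, y):
    """Gradient of the logistic loss -log(sigmoid(y * w.x)) with respect to x.

    y is a label in {-1, +1}. For a real network this would come from one
    backward pass; the toy model allows a closed form.
    """
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    scale = -y * (1.0 - sigmoid(margin))
    return [scale * wi for wi in w]


def coupling_scores(x, g):
    # Assumed per-component coupling: magnitude of input times gradient.
    return [abs(xi * gi) for xi, gi in zip(x, g)]


def cosine(x, g):
    # Global input-gradient alignment, used here as a stand-in correlation.
    dot = sum(xi * gi for xi, gi in zip(x, g))
    nx = math.sqrt(sum(xi * xi for xi in x))
    ng = math.sqrt(sum(gi * gi for gi in g))
    return dot / (nx * ng) if nx > 0 and ng > 0 else 0.0


def mask_top_k(x, scores, k):
    # Zero out the k components with the highest coupling score.
    top = set(sorted(range(len(x)), key=lambda i: scores[i], reverse=True)[:k])
    return [0.0 if i in top else xi for i, xi in enumerate(x)]


# Hypothetical toy weights, input, and label.
w = [2.0, -1.0, 0.5, 0.0]
x = [1.0, 3.0, -2.0, 4.0]
y = 1

g = input_gradient(w, x, y)
scores = coupling_scores(x, g)
x_masked = mask_top_k(x, scores, k=1)
alignment = cosine(x, g)
```

In this toy setup the second component has the largest input-gradient product, so it is the one that gets masked. How the paper defines coupling, chooses the mask, and applies the probe to language models is not stated in the abstract and is not reproduced here.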
