Zero-Direction Probing: A Linear-Algebraic Framework for Deep Analysis of Large-Language-Model Drift
Amit Pandey

TL;DR
Zero-Direction Probing (ZDP) offers a theoretical framework for detecting large-language-model drift by analyzing null directions in transformer activations without relying on task labels or output evaluations.
Contribution
The paper introduces ZDP, a novel theory-only approach with formal guarantees for detecting model drift through null-space analysis of activations.
Findings
Proves the Variance–Leak Theorem and Fisher Null-Conservation.
Derives a Spectral Null-Leakage (SNL) metric with tail bounds.
Provides a-priori thresholds for drift detection under a Gaussian null model.
Abstract
We present Zero-Direction Probing (ZDP), a theory-only framework for detecting model drift from null directions of transformer activations without task labels or output evaluations. Under assumptions A1–A6, we prove: (i) the Variance–Leak Theorem, (ii) Fisher Null-Conservation, (iii) a Rank–Leak bound for low-rank updates, and (iv) a logarithmic-regret guarantee for online null-space trackers. We derive a Spectral Null-Leakage (SNL) metric with non-asymptotic tail bounds and a concentration inequality, yielding a-priori thresholds for drift under a Gaussian null model. These results show that monitoring right/left null spaces of layer activations and their Fisher geometry provides concrete, testable guarantees on representational change.
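The abstract does not spell out how the SNL metric is defined, so the following is only a minimal sketch of the general idea of monitoring a right null space, not the paper's method. It assumes activations are stacked as an (n_samples, d) matrix, estimates the null directions of a reference batch with an SVD, and reports the fraction of a later batch's energy that falls into those directions. The function names `null_basis` and `leakage`, the tolerance, and the synthetic data are all hypothetical.

```python
import numpy as np

def null_basis(acts, rel_tol=1e-8):
    """Orthonormal basis for the numerical right null space of an (n, d) matrix."""
    # Full SVD so that vt holds all d right singular vectors.
    _, s, vt = np.linalg.svd(acts, full_matrices=True)
    # Singular values below rel_tol * largest are treated as zero (assumed tolerance).
    rank = int(np.sum(s > rel_tol * s[0]))
    return vt[rank:].T  # shape (d, d - rank)

def leakage(acts, basis):
    """Fraction of the squared Frobenius norm of `acts` lying in span(basis)."""
    total = np.sum(acts ** 2)
    leaked = np.sum((acts @ basis) ** 2)
    return leaked / total

rng = np.random.default_rng(0)
d, r, n = 16, 4, 200

# Hypothetical reference activations confined to an r-dimensional subspace.
mixing = rng.standard_normal((r, d))
reference = rng.standard_normal((n, r)) @ mixing
basis = null_basis(reference)

# A fresh batch from the same subspace, and a copy perturbed in all d directions.
same = rng.standard_normal((n, r)) @ mixing
drifted = same + 0.1 * rng.standard_normal((n, d))

leak_same = leakage(same, basis)
leak_drifted = leakage(drifted, basis)
print(f"leakage without drift: {leak_same:.3e}")
print(f"leakage with drift:    {leak_drifted:.3e}")
```

In this toy setup the undrifted batch should leak essentially nothing into the reference null directions, while the perturbed batch leaks a small but clearly nonzero fraction. Choosing a threshold between the two is where the paper's tail bounds under a Gaussian null model would apply; none is hard-coded here.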
Taxonomy
Topics: Data Stream Mining Techniques · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
