Zero-Direction Probing: A Linear-Algebraic Framework for Deep Analysis of Large-Language-Model Drift

Amit Pandey

arXiv:2508.06776·cs.LG·August 12, 2025

Open Access

TL;DR

Zero-Direction Probing (ZDP) offers a theoretical framework for detecting large-language-model drift by analyzing null directions in transformer activations without relying on task labels or output evaluations.

Contribution

The paper introduces ZDP, a novel theory-only approach with formal guarantees for detecting model drift through null-space analysis of activations.

Findings

01

Proves the Variance–Leak Theorem and Fisher Null-Conservation.

02

Derives a Spectral Null-Leakage (SNL) metric with tail bounds.

03

Provides a-priori thresholds for drift detection under a Gaussian null model.

Abstract

We present Zero-Direction Probing (ZDP), a theory-only framework for detecting model drift from null directions of transformer activations without task labels or output evaluations. Under assumptions A1–A6, we prove: (i) the Variance–Leak Theorem, (ii) Fisher Null-Conservation, (iii) a Rank–Leak bound for low-rank updates, and (iv) a logarithmic-regret guarantee for online null-space trackers. We derive a Spectral Null-Leakage (SNL) metric with non-asymptotic tail bounds and a concentration inequality, yielding a-priori thresholds for drift under a Gaussian null model. These results show that monitoring right/left null spaces of layer activations and their Fisher geometry provides concrete, testable guarantees on representational change.
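The idea of watching null directions for drift can be illustrated with a small, generic sketch. This is not the paper's SNL metric, whose definition is not given on this page. It assumes a reference activation matrix whose rows are activation vectors, treats the orthogonal complement of their span as the null directions, and reports the fraction of a later batch's energy that falls into those directions. The function names `orthonormal_basis` and `null_leakage` are hypothetical, and plain Gram-Schmidt stands in for the SVD one would more likely use in practice.

```python
import math


def orthonormal_basis(vectors, tol=1e-10):
    """Gram-Schmidt: orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        # Remove the components already covered by the basis.
        for b in basis:
            coeff = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > tol:  # skip vectors that add no new direction
            basis.append([wi / norm for wi in w])
    return basis


def null_leakage(reference, current, tol=1e-10):
    """Fraction of the energy of `current` outside the span of `reference`.

    Illustrative quantity only (an assumption of this sketch):
    0.0 means no energy in the reference null directions,
    1.0 means all energy lies in them.
    """
    basis = orthonormal_basis(reference, tol)
    total = 0.0
    leaked = 0.0
    for x in current:
        residual = list(x)
        for b in basis:
            coeff = sum(ri * bi for ri, bi in zip(residual, b))
            residual = [ri - coeff * bi for ri, bi in zip(residual, b)]
        total += sum(xi * xi for xi in x)
        leaked += sum(ri * ri for ri in residual)
    return leaked / total if total > 0 else 0.0


# Reference activations span only the first two coordinates of R^3,
# so the third coordinate is a null direction.
reference = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [2.0, 1.0, 0.0]]
stable_batch = [[3.0, -1.0, 0.0]]
drifted_batch = [[3.0, 0.0, 4.0]]

stable_score = null_leakage(reference, stable_batch)
drifted_score = null_leakage(reference, drifted_batch)
```

In this toy setup the stable batch stays inside the reference span, while the drifted batch puts 16 of its 25 units of squared norm into the null direction. A detector in this spirit would compare such a score against a threshold; the a-priori thresholds mentioned in the abstract are derived in the paper under its Gaussian null model and are not reproduced here.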

Taxonomy

Topics: Data Stream Mining Techniques · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning