EigenShield: Causal Subspace Filtering via Random Matrix Theory for Adversarially Robust Vision-Language Models
Nastaran Darabi, Devashri Naik, Sina Tayebati, Dinithi Jayasuriya, Ranganath Krishnan, Amit Ranjan Trivedi

TL;DR
EigenShield is a novel inference-time defense for vision-language models that uses Random Matrix Theory to detect and filter adversarial noise in high-dimensional representations, improving robustness without retraining.
Contribution
The paper introduces a spectral-analysis method that uses the spiked covariance model to detect adversarial disruptions in VLM representations. Because it operates at inference time, it avoids costly retraining and architecture modifications.
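For readers unfamiliar with the spiked covariance model, its textbook form is sketched here. The notation ($\sigma^2$, $\theta_i$, $v_i$, $\gamma$) is an assumption chosen for illustration and is not taken from the paper; the formulas are standard Random Matrix Theory results, not the paper's specific derivation.

```latex
% Spiked covariance model: isotropic noise plus k low-rank "spikes".
% p = embedding dimension, n = number of samples (assumed notation).
\Sigma = \sigma^2 I_p + \sum_{i=1}^{k} \theta_i \, v_i v_i^{\top},
\qquad \theta_1 \ge \dots \ge \theta_k > 0 .

% With p/n \to \gamma \in (0, 1], the eigenvalues of the sample covariance
% of pure noise concentrate in the Marchenko-Pastur bulk:
\lambda_{\pm} = \sigma^2 \left( 1 \pm \sqrt{\gamma} \right)^2 .

% A spike separates from the bulk (and is therefore detectable from the
% sample spectrum) only above the Baik-Ben Arous-Peche threshold:
\theta_i > \sigma^2 \sqrt{\gamma} .
```

Sample eigenvalues that fall above $\lambda_{+}$ are the "structured spectral deviations" a spectral defense can inspect; eigenvalues inside the bulk are statistically indistinguishable from noise.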
Findings
EigenShield outperforms existing defenses like adversarial training and UNIGUARD.
It effectively detects adversarial noise using spectral deviations in high-dimensional embeddings.
The method is architecture-independent and attack-agnostic, providing a robust defense mechanism.
Abstract
Vision-Language Models (VLMs) inherit adversarial vulnerabilities of Large Language Models (LLMs), which are further exacerbated by their multimodal nature. Existing defenses, including adversarial training, input transformations, and heuristic detection, are computationally expensive, architecture-dependent, and fragile against adaptive attacks. We introduce EigenShield, an inference-time defense leveraging Random Matrix Theory to quantify adversarial disruptions in high-dimensional VLM representations. Unlike prior methods that rely on empirical heuristics, EigenShield employs the spiked covariance model to detect structured spectral deviations. Using a Robustness-based Nonconformity Score (RbNS) and quantile-based thresholding, it separates causal eigenvectors, which encode semantic information, from correlational eigenvectors that are susceptible to adversarial artifacts. By…
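The general idea of filtering a representation through a spectrally selected subspace can be sketched as follows. This is a minimal illustration under stated assumptions, not EigenShield itself: the excerpt above does not define the RbNS or its quantile threshold, so the sketch swaps in the Marchenko-Pastur upper edge as the selection rule. The function names `mp_upper_edge` and `spectral_filter`, the known `noise_var`, and the synthetic data are all hypothetical.

```python
import numpy as np


def mp_upper_edge(n_samples, n_features, noise_var=1.0):
    """Marchenko-Pastur upper bulk edge for a pure-noise sample covariance."""
    gamma = n_features / n_samples
    return noise_var * (1.0 + np.sqrt(gamma)) ** 2


def spectral_filter(X, noise_var=1.0):
    """Project rows of X onto eigenvectors whose eigenvalues exceed the MP edge.

    Stand-in selection rule (assumption): the paper's RbNS and quantile
    thresholding are not specified in the excerpt, so they are not used here.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    mean = X.mean(axis=0)
    Xc = X - mean                            # center the embeddings
    cov = Xc.T @ Xc / n                      # sample covariance, shape (p, p)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    keep = eigvals > mp_upper_edge(n, p, noise_var)
    V = eigvecs[:, keep]                     # retained subspace basis
    X_filtered = Xc @ V @ V.T + mean         # project, then restore the mean
    return X_filtered, eigvals, keep


# Synthetic example: unit-variance noise plus one strong direction (a "spike").
rng = np.random.default_rng(0)
n, p = 2000, 50
spike_dir = np.zeros(p)
spike_dir[0] = 1.0
X = rng.standard_normal((n, p)) + 5.0 * rng.standard_normal((n, 1)) * spike_dir

X_filtered, eigvals, keep = spectral_filter(X)
print("eigenvalues above the MP edge:", int(keep.sum()))
print("largest eigenvalue:", float(eigvals[-1]))
```

In this toy setup the spiked direction has population variance 26, far above the bulk edge, so it survives the filter while almost all noise directions are discarded. A real defense would also need to estimate the noise variance and, as the abstract indicates, score eigenvectors for robustness rather than rely on eigenvalue magnitude alone.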
