Anchoring the Eigengap: Cross-Modal Spectral Stabilization for Sample-Efficient Representation Learning

Nikhil J. Dhinagar; Vidhi Chhatbar; Chirag Jagad; Pavithra Senthilkumar; Sophia I. Thomopoulos; Mahir H. Khan; Sook-Lei Liew; the ENIGMA-Stroke Recovery Working Group; Paul M. Thompson

arXiv:2605.08764 · cs.LG · May 12, 2026

TL;DR

This paper introduces a spectral theory framework for understanding and improving low-data representation learning, especially in medical imaging, by stabilizing eigengaps through multimodal learning and spectral filtering.

Contribution

It develops a theory linking eigenvalue decay to data efficiency, proposing spectral stabilization via multimodal learning to enhance low-data model performance.

Findings

1. Multimodal learning preserves more stable spectral modes in low-data regimes.
2. Spectral collapse limits the number of recoverable signal modes, affecting classification.
3. Zeta-based spectral filtering improves data efficiency and model stability.

Abstract

Deep vision models degrade sharply in low-data regimes, particularly in medical imaging where labeled samples are scarce. We show this arises not merely from overfitting but from a geometric failure: finite-sample noise corrupts the embedding covariance, collapsing the eigengap and limiting the number of recoverable signal-bearing modes. We develop a spectral theory of finite-sample representation learning that quantifies the recoverable dimension $K(N)$, the number of eigenmodes that can be stably estimated from $N$ samples. Using perturbation theory and concentration bounds, we show that only modes with eigenvalues above the noise floor $\| \hat{\Sigma} - \Sigma \|_{\mathrm{op}} \sim D / N$ are reliable, yielding a truncated Mahalanobis energy that governs classification performance. Under a power-law spectral model, this energy can be approximated by a truncated Riemann zeta function,…
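The quantities named in the abstract can be illustrated with a minimal numerical sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not spell out: the population spectrum is taken to be the power law k^(-alpha), the noise floor is taken to be c * D / N with a hypothetical constant c, K(N) is counted as the number of modes strictly above that floor, and the "truncated zeta function" is read as the partial sum of k^(-s) up to K(N). The function names and the default values of alpha, c, and s are illustrative only.

```python
import numpy as np


def recoverable_dimension(n_samples, dim, alpha=1.5, c=1.0):
    """Count modes whose assumed power-law eigenvalue exceeds the noise floor.

    Assumptions (not from the paper): eigenvalues are k**-alpha for
    k = 1..dim, and the noise floor is c * dim / n_samples.
    """
    k = np.arange(1, dim + 1).astype(float)
    eigenvalues = k ** (-alpha)
    noise_floor = c * dim / n_samples
    return int(np.sum(eigenvalues > noise_floor))


def truncated_zeta(s, k_max):
    """Partial sum of k**-s for k = 1..k_max (0.0 when k_max is 0)."""
    k = np.arange(1, k_max + 1, dtype=float)
    return float(np.sum(k ** (-s)))


if __name__ == "__main__":
    # Hypothetical embedding dimension and sample sizes, for illustration.
    for n in (100, 1_000, 10_000, 100_000):
        k_n = recoverable_dimension(n, dim=64)
        print(n, k_n, truncated_zeta(1.5, k_n))
```

Under these assumptions, K(N) is non-decreasing in N and saturates at the embedding dimension D, so the partial zeta sum grows with sample size until every mode clears the floor.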
