Structure and Redundancy in Large Language Models: A Spectral Study via Random Matrix Theory
Davide Ettori

TL;DR
This thesis applies spectral analysis techniques based on Random Matrix Theory to understand large language models, detect their failures, and compress them, with the aim of improving reliability and efficiency.
Contribution
The thesis presents two methods: EigenTrack, for real-time detection of hallucinations and out-of-distribution behavior, and RMT-KD, for spectral-based model compression. Together they target interpretability and efficiency in large models.
Findings
EigenTrack detects reliability failures (hallucinations and out-of-distribution behavior) early.
RMT-KD produces more compact, energy-efficient models.
Spectral statistics distinguish structured representations from noise.
Abstract
This thesis addresses two persistent and closely related challenges in modern deep learning, reliability and efficiency, through a unified framework grounded in Spectral Geometry and Random Matrix Theory (RMT). As deep networks and large language models continue to scale, their internal behavior becomes increasingly opaque, leading to hallucinations, fragile generalization under distribution shift, and growing computational and energy demands. By analyzing the eigenvalue dynamics of hidden activations across layers and inputs, this work shows that spectral statistics provide a compact, stable, and interpretable lens on model behavior, capable of separating structured, causal representations from noise-dominated variability. Within this framework, the first contribution, EigenTrack, introduces a real-time method for detecting hallucinations and out-of-distribution behavior in large…
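The idea of separating structured representations from noise-dominated variability can be illustrated with a standard Random Matrix Theory tool, the Marchenko-Pastur bulk edge. The sketch below is not the thesis's EigenTrack or RMT-KD procedure; it is a minimal illustration under stated assumptions: the activations are synthetic, the noise variance `sigma2` is taken as known, and the function names and the `margin` parameter (a cushion for finite-sample fluctuations) are hypothetical choices made for this example.

```python
import numpy as np

def mp_upper_edge(n_samples, n_features, sigma2=1.0):
    """Upper edge of the Marchenko-Pastur bulk for a sample covariance matrix
    built from i.i.d. noise with variance sigma2."""
    gamma = n_features / n_samples
    return sigma2 * (1.0 + np.sqrt(gamma)) ** 2

def count_outliers(activations, sigma2=1.0, margin=0.1):
    """Count covariance eigenvalues that lie clearly above the noise bulk.

    activations: array of shape (n_samples, n_features).
    margin: relative cushion above the asymptotic edge (an assumption of this
            sketch, used to absorb finite-sample fluctuations).
    """
    X = activations - activations.mean(axis=0, keepdims=True)
    n, p = X.shape
    cov = X.T @ X / n
    eigvals = np.linalg.eigvalsh(cov)  # real eigenvalues, ascending order
    threshold = mp_upper_edge(n, p, sigma2) * (1.0 + margin)
    return int((eigvals > threshold).sum())

# Synthetic stand-ins for hidden activations (not data from the thesis).
rng = np.random.default_rng(0)
n, p, k = 4000, 64, 3

# Pure noise: every eigenvalue should stay inside the Marchenko-Pastur bulk.
noise = rng.standard_normal((n, p))

# Noise plus k planted directions that carry extra variance.
directions, _ = np.linalg.qr(rng.standard_normal((p, k)))  # orthonormal, p x k
factors = rng.standard_normal((n, k))
structured = rng.standard_normal((n, p)) + 3.0 * factors @ directions.T

n_noise = count_outliers(noise)
n_struct = count_outliers(structured)
print("outliers in pure noise:", n_noise)
print("outliers with planted structure:", n_struct)
```

Eigenvalues inside the bulk are consistent with noise, while eigenvalues above the edge indicate directions with more variance than noise alone would produce. In practice the noise variance is not known and would have to be estimated, which this sketch deliberately leaves out.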
Taxonomy
Topics: Big Data and Digital Economy · Ferroelectric and Negative Capacitance Devices · Stochastic Gradient Optimization Techniques
