Towards Spectroscopy: Susceptibility Clusters in Language Models
Andrew Gordon, Garrett Baker, George Wang, William Snell, Stan van Wingerden, Daniel Murfet

TL;DR
This paper introduces a spectroscopy-inspired method for analyzing neural networks: it measures susceptibilities, the model's responses to perturbations of the data distribution, and finds interpretable clusters covering language patterns, code, and math, which it compares against sparse autoencoder features.
Contribution
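The paper applies a susceptibility-based approach to neural networks. It gives a theoretical account of why tokens cluster in susceptibility space (susceptibilities decompose as a sum over modes of the data distribution) and an empirical clustering of Pythia-14M.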
Findings
A conductance-based clustering algorithm identifies 510 interpretable clusters in Pythia-14M.
Susceptibility clusters align with known linguistic and structural features.
50% of clusters match features from sparse autoencoders.
Abstract
Spectroscopy infers the internal structure of physical systems by measuring their response to perturbations. We apply this principle to neural networks: perturbing the data distribution by upweighting a token in context, we measure the model's response via susceptibilities, which are covariances between component-level observables and the perturbation computed over a localized Gibbs posterior via stochastic gradient Langevin dynamics (SGLD). Theoretically, we show that susceptibilities decompose as a sum over modes of the data distribution, explaining why tokens that follow their contexts "for similar reasons" cluster together in susceptibility space. Empirically, we apply this methodology to Pythia-14M, developing a conductance-based clustering algorithm that identifies 510 interpretable clusters ranging from grammatical patterns to code structure to mathematical…
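The abstract describes a susceptibility as a covariance between a component-level observable and the perturbation, estimated from SGLD draws. A minimal sketch of that covariance step follows, assuming the draws have already been collected and evaluated. The function name, the array layout, and the omission of any sign or scaling convention are assumptions for illustration, not the paper's implementation.

```python
import numpy as np


def covariance_susceptibility(observables, perturbation):
    """Sample covariance between each component observable and the perturbation.

    Hypothetical helper, not the paper's code.

    observables:  shape (n_samples, n_components); row t holds the value of
                  every component-level observable at posterior draw t.
    perturbation: shape (n_samples,); the perturbation term evaluated at the
                  same draws.
    Returns an array of shape (n_components,), one covariance per component.
    """
    observables = np.asarray(observables, dtype=float)
    perturbation = np.asarray(perturbation, dtype=float)
    if observables.ndim != 2:
        raise ValueError("observables must be 2-D (n_samples, n_components)")
    if perturbation.shape != (observables.shape[0],):
        raise ValueError("perturbation must have one entry per sample")

    # Center both quantities over the draws, then average the products.
    centered_obs = observables - observables.mean(axis=0)
    centered_pert = perturbation - perturbation.mean()
    return centered_obs.T @ centered_pert / perturbation.shape[0]


# Toy usage with made-up numbers: 3 draws, 2 components.
toy_observables = [[1.0, 2.0], [2.0, 0.0], [3.0, -2.0]]
toy_perturbation = [1.0, 2.0, 3.0]
toy_result = covariance_susceptibility(toy_observables, toy_perturbation)
```

Stacking one such vector per (context, token) pair would give points in a "susceptibility space" of the kind the abstract clusters. How the paper normalizes or signs these values is not stated in this summary.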
Taxonomy
Topics: Machine Learning in Materials Science · Neural Networks and Reservoir Computing · Advanced Memory and Neural Computing
