Training-Free Spectral Fingerprints of Voice Processing in Transformers

Valentin Noël

arXiv:2510.19131 · cs.CL · October 23, 2025

Open Access · 3 Reviews

TL;DR

This paper introduces a training-free spectral analysis method for detecting architectural signatures and biases in how transformer models process grammatical voice, revealing how training emphasis shapes connectivity patterns during linguistic tasks.

Contribution

It presents a novel spectral fingerprinting framework using graph signal processing to analyze transformer attention structures without training, uncovering model-specific and language-specific signatures.

Findings

01. Phi-3-Mini shows English-specific early layer disruption.

02. Spectral signatures correlate strongly with behavioral differences.

03. Attention head ablations confirm the functional relevance of spectral effects.

Abstract

Different transformer architectures implement identical linguistic computations via distinct connectivity patterns, yielding model-imprinted "computational fingerprints" detectable through spectral analysis. Using graph signal processing on attention-induced token graphs, we track changes in algebraic connectivity (Fiedler value, $\Delta\lambda_{2}$) under voice alternation across 20 languages and three model families, with a prespecified early window (layers 2–5). Our analysis uncovers clear architectural signatures: Phi-3-Mini shows a dramatic English-specific early-layer disruption ($\overline{\Delta\lambda_{2}}_{[2,5]} \approx -0.446$) while effects in 19 other languages are minimal, consistent with public documentation that positions the model primarily for English use. Qwen2.5-7B displays small, distributed shifts that are largest for morphologically rich languages, and…
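The quantity tracked in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: averaging heads with a plain mean, using the unnormalized Laplacian $L = D - W$, taking the difference as passive minus active, and treating list index as layer number are all assumptions made here for illustration. The function names `fiedler_value` and `early_window_delta` are hypothetical.

```python
import numpy as np


def fiedler_value(attn: np.ndarray) -> float:
    """Second-smallest Laplacian eigenvalue of an attention-induced token graph.

    attn: array of shape (heads, tokens, tokens) holding one layer's
    post-softmax attention. Head averaging and the unnormalized Laplacian
    are assumptions of this sketch, not details confirmed by the paper.
    """
    a = attn.mean(axis=0)              # aggregate heads (assumed: plain mean)
    w = 0.5 * (a + a.T)                # symmetrize into an undirected graph
    np.fill_diagonal(w, 0.0)           # drop self-loops
    lap = np.diag(w.sum(axis=1)) - w   # L = D - W
    return float(np.linalg.eigvalsh(lap)[1])  # eigvalsh returns ascending order


def early_window_delta(attn_active, attn_passive, window=(2, 5)) -> float:
    """Mean change in the Fiedler value over an inclusive layer window.

    attn_active / attn_passive: per-layer sequences of (heads, tokens, tokens)
    arrays. The two sentences may have different token counts.
    Sign convention (passive minus active) is an assumption.
    """
    lo, hi = window
    deltas = [
        fiedler_value(attn_passive[layer]) - fiedler_value(attn_active[layer])
        for layer in range(lo, hi + 1)
    ]
    return float(np.mean(deltas))


# Usage with synthetic attention (random softmax rows), standing in for
# attention maps that would normally be read out of a real model.
rng = np.random.default_rng(0)


def random_attention(layers: int, heads: int, tokens: int) -> np.ndarray:
    logits = rng.normal(size=(layers, heads, tokens, tokens))
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


active = random_attention(layers=8, heads=4, tokens=6)
passive = random_attention(layers=8, heads=4, tokens=7)
print(early_window_delta(active, passive))
```

A Fiedler value near zero means the token graph is close to splitting into disconnected parts, so a strongly negative window mean would indicate that the voice-alternated sentence yields a much less connected attention graph in those layers.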

Peer Reviews

Decision · ICLR 2026 Conference Withdrawn Submission

Reviewer 01 · Rating 2 · Confidence 2

Strengths

1. The paper devises a new metric to interpret model representations and linguistic processing, i.e. the Fiedler value of the transformed post-softmax attention.
2. The paper conducts rigorous statistical tests over 3 different models and several tasks to investigate evidence for generalization.
3. The hallucination results, restated from another paper, are cool. Nice!

Weaknesses

0. The figure texts are minuscule. Please increase the text sizes in all your figures. It's incredibly hard to read and interpret your figures.
1. The authors use the prespecified early-window mean $\Delta \lambda_{2}[2,5]$ as the primary endpoint (layers 2–5), a very important decision that is relegated to the appendix. However, the choice of layers 2–5 seems to be derived using trends observed across the three models, as opposed to being model specific. For instance, if I were to average the F

Reviewer 02 · Rating 6 · Confidence 3

Strengths

* The study is robust: its 20-language design and three diverse model families uncover architecture-imprinted patterns rather than model-specific anecdotes.
* The idea of a lightweight audit to detect language specialization/brittleness (e.g., Phi-3's English-specific signature) and the preliminary extension to hallucination detection make the method societally and operationally relevant. This is a nice idea that the community will find interesting and possibly use for

Weaknesses

* While head ablations help, most findings are correlational. The English-specific Phi-3 effect is interpreted as consistent with training emphasis. The paper mentions this is not definitive training-data attribution.
* Building graphs from softmax attention (often noisy and not strictly causal) and then symmetrizing/aggregating heads may wash out meaningful directionality or head specialization. I don't feel too strongly about this but I think it's worth mentioning.
* Focusing on voice alternat

Reviewer 03 · Rating 2 · Confidence 3

Strengths

- **Interesting idea**: The idea of using graph spectral processing techniques for interpretability is quite natural and interesting.
- **Systematic testing** across 20 languages and 3 model families.
- **Statistical rigor** (bootstrap CIs, permutation tests, FDR correction, attempts to correct for differences in tokenization, etc.). I'm impressed by the authors' statistical rigor.

Weaknesses

The work is still undermotivated, the results are weak, and it's unclear what this buys you over existing interpretability methods. The writing is needlessly technical and poorly structured.

A. Weak motivation and unclear utility

- **Why these metrics?** There is no explanation for why we should study the Fiedler value specifically. Appendix A.1 claims theoretical grounding, but the argument is hand-wavy: "models that struggle... may exhibit a breakdown in connectivity... leading to a signific

Taxonomy

Topics: Neurobiology of Language and Bilingualism · Functional Brain Connectivity Studies · Face Recognition and Perception