Mechanistic Interpretability of Brain-to-Speech Models Across Speech Modes
Maryam Maghsoudi, Ayushi Mishra

TL;DR
This paper investigates how brain-to-speech models internally represent different speech modes. It finds that the modes share continuous representations that are organized hierarchically and mediated by localized subspaces within the model.
Contribution
It introduces a causal interpretability framework to analyze internal representations, uncovering the hierarchical and shared structure of speech modality encoding.
Findings
Speech modes share a continuous causal manifold.
Cross-mode transfer relies on compact, layer-specific subspaces.
Localized neuron subsets influence cross-mode transfer.
Abstract
Brain-to-speech decoding models demonstrate robust performance on vocalized, mimed, and imagined speech, yet the mechanisms by which these models capture and transmit information across speech modes remain underexplored. In this work, we use mechanistic interpretability to causally investigate the internal representations of a neural speech decoder. We patch internal activations across speech modes, and use tri-modal interpolation to examine whether speech representations vary discretely or continuously. We use coarse-to-fine causal tracing and causal scrubbing to locate causal structure, which lets us identify internal subspaces that are sufficient for cross-mode transfer. To determine how finely these effects are distributed within layers, we perform neuron-level activation patching. We discover that small but…
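The two core operations named in the abstract, cross-mode activation patching and tri-modal interpolation, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy two-layer network stands in for the real decoder, the names `forward`, `patch_units`, `interpolate_modes`, and `recovery` are hypothetical, interpolation is assumed to be a convex combination of the three modes' hidden activations, and the recovery score is one common way to quantify a patch, not necessarily the authors' metric.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy decoder: a two-layer network standing in for the real model.
W1 = rng.normal(size=(8, 16))   # input features -> hidden units
W2 = rng.normal(size=(16, 4))   # hidden units -> output features


def hidden(x):
    """Hidden-layer activations for an input vector."""
    return np.tanh(x @ W1)


def forward(x, patch=None):
    """Run the toy decoder, optionally rewriting the hidden activations."""
    h = hidden(x)
    if patch is not None:
        h = patch(h)
    return h @ W2


def patch_units(source_h, units):
    """Build a patch that overwrites the chosen hidden units with source values."""
    def _patch(h):
        h = h.copy()
        h[..., units] = source_h[..., units]
        return h
    return _patch


def interpolate_modes(h_vocal, h_mimed, h_imagined, weights):
    """Assumed form of tri-modal interpolation: a convex mix of three activations."""
    w = np.asarray(weights, dtype=float)
    if w.shape != (3,) or np.any(w < 0) or not np.isclose(w.sum(), 1.0):
        raise ValueError("weights must be three non-negative numbers summing to 1")
    return w[0] * h_vocal + w[1] * h_mimed + w[2] * h_imagined


def recovery(clean_out, corrupt_out, patched_out):
    """Fraction of the clean-vs-corrupt output gap closed by the patch."""
    gap = np.linalg.norm(clean_out - corrupt_out)
    return 1.0 - np.linalg.norm(clean_out - patched_out) / gap


# Synthetic inputs standing in for recordings from three speech modes.
x_vocal, x_mimed, x_imag = rng.normal(size=(3, 8))

clean = forward(x_vocal)    # run on the source mode
corrupt = forward(x_imag)   # run on the target mode, unpatched

# Patch every hidden unit from the vocalized run into the imagined run.
all_units = list(range(16))
patched = forward(x_imag, patch=patch_units(hidden(x_vocal), all_units))
full_recovery = recovery(clean, corrupt, patched)  # all units patched, so 1.0

# Sample the midpoint of the three modes in hidden space and decode it.
h_mid = interpolate_modes(hidden(x_vocal), hidden(x_mimed), hidden(x_imag),
                          [1 / 3, 1 / 3, 1 / 3])
out_mid = h_mid @ W2
```

Restricting `units` to a small index set turns the same routine into a neuron-level patch: if a few units recover most of the gap, the effect is localized, whereas a score that rises only as many units are included points to a distributed effect.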
Taxonomy
Topics: Neural dynamics and brain function · Neuroscience and Music Perception · Phonetics and Phonology Research
