Where Does Speech Enhancement Adapt? Probing Study Under Controlled Degradation
Yair Amar, Amir Ivry, Israel Cohen

TL;DR
This study investigates how speech enhancement models internally adapt to input degradations like noise and reverberation, revealing layer-specific robustness and the influence of training objectives.
Contribution
It introduces a probing method to analyze internal representations under controlled degradations, highlighting where and how models adapt.
Findings
Encoder layers maintain noise-invariant representations.
Decoder layers adapt strongly to degradations.
Structural robustness patterns are consistent across different architectures.
Abstract
Speech enhancement (SE) models advance rapidly, yet it remains underexplored how degradation of input signals affects their internal representations. We introduce a probing process aimed at modeling the behavior of internal representations in SE models under controlled degradations to input signals. We apply it to the MUSE SE model by extracting its layer activations under controlled signal-to-noise ratio (SNR) and reverberation (clarity index C50). We measure layer-wise representational similarity to clean input references using Centered Kernel Alignment (CKA) and regress it against the degradation level, yielding compact, robustness-adaptive profiles. Encoder layers maintain noise-invariant representations while decoder layers adapt strongly, with sensitivity increasing monotonically within blocks and skip-connection boundaries marking the sharpest transitions. The same structure emerges under…
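The measurement described in the abstract (CKA similarity to a clean reference, regressed against the degradation level) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the authors' implementation: the linear variant of CKA, the ordinary least-squares line fit, the function names `linear_cka` and `sensitivity_slope`, and the random matrices that stand in for real layer activations (no MUSE model is loaded).

```python
import numpy as np


def linear_cka(x, y):
    """Linear CKA between two activation matrices of shape (n_samples, n_features)."""
    # Center each feature so the similarity ignores constant offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(x.T @ y, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


def sensitivity_slope(levels, similarities):
    """Fit similarity = slope * level + intercept (hypothetical per-layer profile)."""
    slope, intercept = np.polyfit(levels, similarities, deg=1)
    return float(slope), float(intercept)


# Synthetic stand-in for one layer's activations on clean input (assumption).
rng = np.random.default_rng(0)
clean = rng.standard_normal((200, 16))

# Illustrative SNR grid in dB; the paper's actual grid is not given in the source.
snr_levels = np.array([-5.0, 0.0, 5.0, 10.0, 20.0])

cka_values = []
for snr_db in snr_levels:
    # Scale unit-variance noise so it sits snr_db below the unit-variance signal.
    noise_scale = 10 ** (-snr_db / 20)
    degraded = clean + noise_scale * rng.standard_normal(clean.shape)
    cka_values.append(linear_cka(clean, degraded))

slope, intercept = sensitivity_slope(snr_levels, np.array(cka_values))
print("CKA per SNR level:", np.round(cka_values, 3))
print("slope per dB:", slope)
```

Under this sketch, a layer whose fitted slope is close to zero would be read as degradation-invariant, while a larger slope would indicate a layer whose representation shifts with the input SNR. Repeating the fit per layer gives one compact number per layer to compare across encoder and decoder.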
