A Systematic Comparison of Phonetic Aware Techniques for Speech Enhancement
Or Tal, Moshe Mandel, Felix Kreuk, Yossi Adi

TL;DR
This paper systematically compares various methods of incorporating phonetic information into neural speech enhancement models, evaluating their impact on performance with different feature sources and injection techniques.
Contribution
It provides a comprehensive analysis of phonetic feature integration methods, highlighting the effectiveness of feature conditioning and the superiority of SSL-based features.
Findings
SSL features outperform ASR features in most cases
Feature conditioning yields the best enhancement performance
The embedding layer from which phonetic features are taken influences their effectiveness
Abstract
Speech enhancement has seen great improvement in recent years using end-to-end neural networks. However, most models are agnostic to the spoken phonetic content. Recently, several studies suggested phonetic-aware speech enhancement, mostly using perceptual supervision. Yet, injecting phonetic features during model optimization can take additional forms (e.g., model conditioning). In this paper, we conduct a systematic comparison between different methods of incorporating phonetic information in a speech enhancement model. By conducting a series of controlled experiments, we observe the influence of different phonetic content models as well as various feature-injection techniques on enhancement performance, considering both causal and non-causal models. Specifically, we evaluate three settings for injecting phonetic information, namely: i) feature conditioning; ii) perceptual…
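The two injection styles the abstract names explicitly, feature conditioning and perceptual supervision, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the random projection `W_PHON` stands in for a frozen phonetic model (ASR or SSL based), and the frame length, feature dimension, and the helper names `frames`, `phonetic_features`, `condition`, and `perceptual_loss` are all hypothetical. Extracting the conditioning features from the noisy input is likewise an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

FRAME = 160    # hypothetical frame length in samples
PHON_DIM = 8   # hypothetical phonetic feature dimension
# Stand-in for a frozen, pretrained phonetic model (assumption, not a real model).
W_PHON = rng.standard_normal((FRAME, PHON_DIM))


def frames(wav):
    """Split a 1-D waveform into non-overlapping frames of shape (T, FRAME)."""
    n = len(wav) // FRAME
    return wav[: n * FRAME].reshape(n, FRAME)


def phonetic_features(wav):
    """Hypothetical frozen feature extractor returning shape (T, PHON_DIM)."""
    return np.tanh(frames(wav) @ W_PHON)


def condition(latent, phon):
    """Feature conditioning: concatenate phonetic features to the latent, per frame."""
    assert latent.shape[0] == phon.shape[0], "features must be time-aligned"
    return np.concatenate([latent, phon], axis=-1)


def perceptual_loss(enhanced, clean):
    """Perceptual supervision: L1 distance between phonetic features of two signals."""
    return float(np.mean(np.abs(phonetic_features(enhanced) - phonetic_features(clean))))


clean = rng.standard_normal(1600)
noisy = clean + 0.3 * rng.standard_normal(1600)

latent = frames(noisy)  # stand-in for an encoder output, shape (10, 160)
conditioned = condition(latent, phonetic_features(noisy))  # shape (10, 168)
loss = perceptual_loss(noisy, clean)  # extra training term in the supervision setting
```

In the conditioning case the phonetic features become part of the model input, so they are needed at inference time as well; in the supervision case they only shape the training loss.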
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Phonetics and Phonology Research
