Translate Reverberated Speech to Anechoic Ones: Speech Dereverberation with BERT
Yang Jiao

TL;DR
This paper introduces a single-channel speech dereverberation method that pairs a BERT-based sequence model with a pre-trained neural vocoder. On data from the 3rd CHiME challenge, it outperforms the traditional WPE method and performs comparably to state-of-the-art BLSTM-based models.
Contribution
It applies BERT, an NLP transformer model, to speech dereverberation, adding a pre-sequence network ahead of the backbone sequence model and a pre-trained neural vocoder for implicit phase reconstruction.
Findings
Outperforms the traditional WPE dereverberation method.
Achieves performance comparable to state-of-the-art BLSTM-based sequence models.
Demonstrates that BERT can serve as a backbone sequence model for enhancing reverberated speech.
Abstract
Single-channel speech dereverberation is considered in this work. Inspired by the recent success of the Bidirectional Encoder Representations from Transformers (BERT) model in the domain of Natural Language Processing (NLP), we investigate its applicability as a backbone sequence model to enhance reverberated speech signals. We present a variation of the basic BERT model: a pre-sequence network, which extracts local spectral-temporal information and/or provides order information, before the backbone sequence model. In addition, we use a pre-trained neural vocoder for implicit phase reconstruction. To evaluate our method, we use data from the 3rd CHiME challenge and compare our results with other methods. Experiments show that the proposed method outperforms the traditional WPE method and achieves comparable performance with state-of-the-art BLSTM-based sequence models.
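The data flow described in the abstract (order information added to spectral frames, followed by an attention-based sequence model) can be illustrated with a toy sketch. This is not the paper's implementation: the abstract does not say how order information is provided, so sinusoidal position encodings are an assumption here, the attention uses identity projections instead of learned weights, and the neural vocoder stage is omitted. The function names (`sinusoidal_positions`, `self_attention`, `encode_frames`) are hypothetical.

```python
import math

def sinusoidal_positions(num_frames, dim):
    """Sinusoidal position encodings, one vector per frame (an assumed choice)."""
    table = []
    for t in range(num_frames):
        row = []
        for i in range(dim):
            angle = t / (10000 ** (2 * (i // 2) / dim))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        table.append(row)
    return table

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(frames):
    """Single-head scaled dot-product self-attention with identity projections."""
    dim = len(frames[0])
    scale = math.sqrt(dim)
    out = []
    for query in frames:
        # Similarity of this frame to every frame in the utterance.
        scores = [sum(a * b for a, b in zip(query, key)) / scale for key in frames]
        weights = softmax(scores)
        # Each output frame is a weighted average of all input frames.
        out.append([sum(w * v[j] for w, v in zip(weights, frames)) for j in range(dim)])
    return out

def encode_frames(frames):
    """Add order information to each frame, then apply self-attention."""
    positions = sinusoidal_positions(len(frames), len(frames[0]))
    with_order = [[x + p for x, p in zip(frame, pos)]
                  for frame, pos in zip(frames, positions)]
    return self_attention(with_order)

# Toy "spectrogram": 4 time frames with 6 features each (made-up values).
frames = [[0.1 * (t + j) for j in range(6)] for t in range(4)]
encoded = encode_frames(frames)
print(len(encoded), len(encoded[0]))  # output keeps the input's frame and feature counts
```

Because every output frame attends to all frames at once, the model can draw on context from both earlier and later parts of the utterance, which is the property that makes a bidirectional encoder a plausible alternative to a BLSTM for this task.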
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Phonetics and Phonology Research
Methods: Linear Layer · Attention Dropout · Adam · Dense Connections · Linear Warmup With Linear Decay · Residual Connection · Dropout · Layer Normalization · Multi-Head Attention
