Correlation Distance Skip Connection Denoising Autoencoder (CDSK-DAE) for Speech Feature Enhancement
Alzahra Badi, Sangwook Park, David K. Han, Hanseok Ko

TL;DR
This paper introduces a novel correlation distance skip connection denoising autoencoder (CDSK-DAE) for speech feature enhancement, which improves the noise robustness of end-to-end ASR systems in both seen and unseen noisy environments.
Contribution
It proposes a new autoencoder architecture with skip connections and a correlation distance-based training objective, enhancing speech features for noise-robust ASR.
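To make the skip-connection idea concrete, here is a minimal NumPy sketch of a forward pass in which the target (center) frame of the noisy input window is fed again to an encoder-side layer and a decoder-side layer. The layer sizes, the `tanh` activations, the concatenation-based skips, and the names `dense` and `forward` are all assumptions made for illustration; the paper's actual architecture may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(in_dim, out_dim):
    """Hypothetical helper: random weights and zero bias for one layer."""
    return rng.normal(scale=0.1, size=(out_dim, in_dim)), np.zeros(out_dim)

def forward(window, params):
    """Map a noisy context window (n_frames, n_feats) to one enhanced frame.

    The center frame is treated as the target frame and is concatenated
    to the inputs of the second encoder layer and of the decoder layer
    (an assumed form of skip connection).
    """
    center = window[window.shape[0] // 2]
    x = window.ravel()
    (w1, b1), (w2, b2), (w3, b3) = params
    h = np.tanh(w1 @ x + b1)
    # Encoder-side skip: the latent layer also sees the target frame.
    z = np.tanh(w2 @ np.concatenate([h, center]) + b2)
    # Decoder-side skip: the output layer also sees the target frame.
    return w3 @ np.concatenate([z, center]) + b3

# Illustrative sizes only (not taken from the paper).
n_frames, n_feats, hidden, latent = 5, 40, 64, 32
params = [
    dense(n_frames * n_feats, hidden),
    dense(hidden + n_feats, latent),
    dense(latent + n_feats, n_feats),
]
window = rng.normal(size=(n_frames, n_feats))
enhanced = forward(window, params)
```

The sketch only shows how target-frame information can bypass earlier layers; training and the choice of features are left out.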
Findings
Outperforms conventional and state-of-the-art models in noisy conditions
Improves word error rate (WER) across multiple noise types and SNR levels
Effective with both linear and non-linear penalty terms
Abstract
The performance of learning-based Automatic Speech Recognition (ASR) is susceptible to noise, especially noise that appears in the test data but is absent from the training data. This work focuses on feature enhancement for a noise-robust end-to-end ASR system by introducing a novel variant of the denoising autoencoder (DAE). The proposed method uses skip connections on both the encoder and decoder sides, passing speech information of the target frame from the input to the model. It also uses a new training objective whose penalty terms use a correlation distance measure to quantify the dependency between the latent target features and the model outputs (the latent features and enhanced features obtained from the DAE). The proposed method was compared against a conventional model and a state-of-the-art model under both seen and unseen noisy environments of 7 different types…
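The penalty idea in the abstract can be sketched as a reconstruction loss plus a weighted correlation-distance term. The sketch below assumes that "correlation distance" means one minus the Pearson correlation (the definition used by `scipy.spatial.distance.correlation`); the paper's exact measure, the feature pairs it penalizes, and the weight are not given in this summary, so `correlation_distance`, `penalized_loss`, `pairs`, and `weight` are hypothetical names and choices.

```python
import numpy as np

def correlation_distance(a, b, eps=1e-8):
    """Assumed definition: 1 - Pearson correlation of the flattened arrays."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b) + eps
    return 1.0 - float(a @ b) / denom

def penalized_loss(enhanced, clean, pairs, weight=0.1):
    """Mean squared error plus a weighted sum of correlation distances.

    `pairs` is a list of (model_features, target_features) tuples; which
    pairs the paper actually penalizes is an assumption left to the caller.
    """
    diff = np.asarray(enhanced, dtype=float) - np.asarray(clean, dtype=float)
    mse = float(np.mean(diff ** 2))
    penalty = sum(correlation_distance(m, t) for m, t in pairs)
    return mse + weight * penalty

# Small usage example with made-up feature vectors.
x = np.array([1.0, 2.0, 3.0, 4.0])
same = correlation_distance(x, x)               # close to 0
opposite = correlation_distance(x, -x)          # close to 2
shifted = correlation_distance(x, 2.0 * x + 3)  # close to 0 (affine invariant)
loss = penalized_loss(x, x, pairs=[(x, x)])
```

Because this distance ignores offset and scale, it rewards features that vary together with their targets rather than features that match them value for value, which is why it is paired here with a reconstruction term.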
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
Methods: Denoising Autoencoder
