Noise Classification Aided Attention-Based Neural Network for Monaural Speech Enhancement
Lu Ma, Song Yang, Yaguang Gong, Zhongqin Wu

TL;DR
This paper introduces a neural network for monaural speech enhancement that incorporates noise classification and attention mechanisms, improving speech quality and generalization to unseen noise types.
Contribution
It presents a novel end-to-end neural network that integrates a noise classification subnetwork with attention mechanisms for monaural speech enhancement.
Findings
Outperforms OM-LSA and previous methods in speech quality (PESQ)
Demonstrates better generalization to unseen noise conditions
Uses a combined LSTM, attention, and classification architecture
Abstract
This paper proposes a noise type classification aided attention-based neural network approach for monaural speech enhancement. The network builds on a previous work by introducing a noise classification subnetwork into the structure and feeding the classification embedding into the attention mechanism to guide the network toward better feature extraction. Specifically, to make the network end-to-end, an audio encoder and decoder constructed from temporal convolutions are used to transform between waveform and spectrogram. Additionally, our model is composed of two long short-term memory (LSTM) based encoders, two attention mechanisms, a noise classifier, and a speech mask generator. Experiments show that, compared with OM-LSA and the previous work, the proposed noise classification aided attention-based approach can achieve better performance in terms of…
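The idea of letting a noise classification embedding guide attention can be illustrated with a small NumPy sketch. The abstract does not say how the embedding enters the attention mechanism, so everything here is an assumption made for illustration: the embedding is concatenated to each frame before the query projection, scaled dot-product attention is used, and a sigmoid mask is applied to the features. The names `noise_aided_attention`, `speech_mask`, `w_query`, `w_key`, and `w_mask`, and all dimensions and random weights, are hypothetical and not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def noise_aided_attention(features, noise_embedding, w_query, w_key):
    """Attention over frames with a noise embedding mixed into the queries.

    features:        (T, D) per-frame encoder outputs (stand-in for LSTM outputs)
    noise_embedding: (E,)   embedding from a noise classifier (assumed given)
    w_query:         (D + E, A) query projection (hypothetical)
    w_key:           (D, A)     key projection (hypothetical)
    """
    num_frames = features.shape[0]
    # Assumption: the same noise embedding is appended to every frame.
    tiled = np.tile(noise_embedding, (num_frames, 1))               # (T, E)
    queries = np.concatenate([features, tiled], axis=1) @ w_query   # (T, A)
    keys = features @ w_key                                         # (T, A)
    scores = queries @ keys.T / np.sqrt(keys.shape[1])              # (T, T)
    weights = softmax(scores, axis=-1)                              # rows sum to 1
    context = weights @ features                                    # (T, D)
    return context, weights

def speech_mask(context, w_mask):
    # Sigmoid keeps every mask value in [0, 1].
    return 1.0 / (1.0 + np.exp(-(context @ w_mask)))

# Toy dimensions and random weights, for illustration only.
rng = np.random.default_rng(0)
T, D, E, A = 6, 8, 4, 5
features = rng.standard_normal((T, D))
noise_embedding = rng.standard_normal(E)
w_query = rng.standard_normal((D + E, A))
w_key = rng.standard_normal((D, A))
w_mask = rng.standard_normal((D, D))

context, weights = noise_aided_attention(features, noise_embedding, w_query, w_key)
mask = speech_mask(context, w_mask)
enhanced = mask * features  # masked features; a decoder would map these back to a waveform
```

In a trained model the projections would be learned jointly with the classifier, and the temporal-convolution encoder and decoder described in the abstract would sit before and after this step. The sketch only shows the data flow from embedding to attention weights to mask.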
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Advanced Adaptive Filtering Techniques
