Wavelet speech enhancement based on nonnegative matrix factorization
Syu-Siang Wang, Alan Chern, Yu Tsao, Jeih-weih Hung, Xugang Lu, Ying-Hui Lai, Borching Su

TL;DR
This paper introduces a speech enhancement technique that combines the discrete wavelet packet transform (DWPT) with nonnegative matrix factorization (NMF), improving speech quality and intelligibility while avoiding the signal distortion that the short-time Fourier transform (STFT) generally introduces.
Contribution
The study proposes a DWPT-NMF-based method that avoids the distortion associated with STFT spectrograms and enhances speech signals more effectively than traditional STFT-NMF approaches.
Findings
Outperforms traditional STFT-NMF in speech quality
Enhances speech intelligibility in noisy environments
Splits the signal into subbands without introducing distortion
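
The distortion-free claim for the wavelet packet split can be checked numerically. The following is a minimal sketch, not the authors' implementation: it assumes the Haar wavelet purely for brevity (the summary does not say which wavelet the paper uses), and the function names `haar_split`, `haar_merge`, `dwpt`, and `idwpt` are hypothetical. It decomposes a signal into a full wavelet packet tree and then rebuilds it from the subbands.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_split(x):
    """One Haar analysis step: (low band, high band), each half as long as x."""
    low = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return low, high

def haar_merge(low, high):
    """Inverse of haar_split: interleave the reconstructed sample pairs."""
    x = []
    for a, d in zip(low, high):
        x.append((a + d) / SQRT2)
        x.append((a - d) / SQRT2)
    return x

def dwpt(x, levels):
    """Full wavelet packet tree: every band is split again at every level.

    Assumes len(x) is divisible by 2**levels. Returns 2**levels subbands
    in natural (tree) order, not sorted by frequency.
    """
    bands = [list(x)]
    for _ in range(levels):
        bands = [half for band in bands for half in haar_split(band)]
    return bands

def idwpt(bands):
    """Rebuild the time-domain signal by merging sibling bands level by level."""
    bands = [list(b) for b in bands]
    while len(bands) > 1:
        bands = [haar_merge(bands[i], bands[i + 1])
                 for i in range(0, len(bands), 2)]
    return bands[0]

# Usage: a toy 16-sample signal split into 4 subbands and rebuilt.
toy = [math.sin(0.3 * n) for n in range(16)]
rebuilt = idwpt(dwpt(toy, 2))
max_error = max(abs(a - b) for a, b in zip(toy, rebuilt))  # floating-point round-off only
```

Because each analysis step here is orthonormal and exactly invertible, the subbands carry the same information as the raw samples, and any processing applied per subband maps back to the time domain through `idwpt`.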
Abstract
In most state-of-the-art speech enhancement techniques, a spectrogram is usually preferred over the corresponding time-domain raw data, since it provides a more compact representation together with conspicuous temporal information over a long time span. However, the short-time Fourier transform (STFT) that creates the spectrogram in general distorts the original signal and thereby limits the capability of the associated speech enhancement techniques. In this study, we propose a novel speech enhancement method that adopts the discrete wavelet packet transform (DWPT) and nonnegative matrix factorization (NMF) in order to overcome this limitation. In brief, the DWPT is first applied to split a time-domain speech signal into a series of subband signals without introducing any distortion. Then we exploit NMF to highlight the speech component for each subband.…
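
To make the NMF step concrete, here is a minimal sketch of NMF with the standard multiplicative updates for the squared Euclidean cost. It is an illustration under assumptions, not the paper's procedure: the excerpt above does not state which cost function the authors use or how a nonnegative matrix is built from each (signed) subband signal, so the input `v` is simply any nonnegative matrix, and the names `matmul`, `transpose`, `frob_error`, and `nmf` are hypothetical.

```python
import random

def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    return [list(col) for col in zip(*a)]

def frob_error(v, w, h):
    """Squared Frobenius norm of v - w @ h."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))

def nmf(v, rank, iters=200, seed=0, eps=1e-12):
    """Factor a nonnegative matrix v (n x m) as w (n x rank) times h (rank x m).

    Uses multiplicative updates, which keep every entry of w and h
    nonnegative. eps guards against division by zero.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # h <- h * (w^T v) / (w^T w h), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # w <- w * (v h^T) / (w h h^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h

# Usage: factor a small nonnegative matrix with two basis vectors.
example = [[1.0, 2.0, 0.0], [0.0, 1.0, 2.0], [1.0, 3.0, 2.0]]
w_est, h_est = nmf(example, rank=2)
residual = frob_error(example, w_est, h_est)
```

In NMF-based enhancement generally, the columns of `w` act as basis vectors and `h` holds their activations over time; how the paper assigns bases to speech and noise within each subband is not covered by the excerpt.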
