Speech Enhancement via Two-Stage Dual Tree Complex Wavelet Packet Transform with a Speech Presence Probability Estimator
Pengfei Sun, Jun Qin

TL;DR
This paper introduces a two-stage dual tree complex wavelet packet transform (DTCWPT) speech enhancement method that combines speech presence probability (SPP) estimation with MMSE filtering to reduce noise and improve speech quality at low SNR.
Contribution
It proposes a two-stage DTCWPT framework with an SPP estimator derived from a generalized Gamma speech model and a Gaussian noise model, improving speech quality over existing methods.
Findings
Outperforms four state-of-the-art algorithms in PESQ scores.
Achieves higher segmental SNR at low SNR levels.
Effectively reduces nonstationary noise in speech signals.
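For readers unfamiliar with the segmental SNR metric named above, the following is a minimal sketch of one common definition: the frame-wise SNR in dB, clamped to a fixed range and averaged over frames. The frame length and the [-10, 35] dB clamp are conventional choices assumed here, not values taken from the paper, and the function name `segmental_snr` is hypothetical.

```python
import numpy as np

def segmental_snr(clean, enhanced, frame_len=256, lo_db=-10.0, hi_db=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [lo_db, hi_db].

    frame_len, lo_db and hi_db are assumed defaults, not the paper's settings.
    """
    clean = np.asarray(clean, dtype=float)
    enhanced = np.asarray(enhanced, dtype=float)
    n_frames = min(len(clean), len(enhanced)) // frame_len
    if n_frames == 0:
        raise ValueError("signals are shorter than one frame")
    eps = 1e-12  # guards against division by zero and log of zero
    snrs = []
    for k in range(n_frames):
        s = clean[k * frame_len:(k + 1) * frame_len]
        err = s - enhanced[k * frame_len:(k + 1) * frame_len]
        snr_db = 10.0 * np.log10((np.sum(s ** 2) + eps) / (np.sum(err ** 2) + eps))
        snrs.append(np.clip(snr_db, lo_db, hi_db))
    return float(np.mean(snrs))

# Usage: compare a noisy signal against its clean reference.
rng = np.random.default_rng(0)
t = np.arange(4096)
clean_sig = np.sin(2 * np.pi * 0.01 * t)
noisy_sig = clean_sig + 0.1 * rng.standard_normal(t.size)
print(segmental_snr(clean_sig, noisy_sig))
```

The clamp keeps silent or near-perfect frames from dominating the average, which is why segmental SNR is often reported alongside PESQ rather than a single global SNR.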
Abstract
This paper proposes a two-stage dual tree complex wavelet packet transform (DTCWPT) based speech enhancement algorithm, for which a speech presence probability (SPP) estimator and a generalized minimum mean squared error (MMSE) estimator are developed. To overcome the signal distortions caused by downsampling in the WPT, a two-stage analytic decomposition that concatenates an undecimated WPT (UWPT) with a decimated WPT is employed. An SPP estimator in the DTCWPT domain is derived from a generalized Gamma distribution for speech and a Gaussian assumption for noise. The validation results show that the proposed algorithm achieves higher perceptual evaluation of speech quality (PESQ) scores and segmental signal-to-noise ratio (SegSNR) under low-SNR nonstationary noise than four other state-of-the-art speech enhancement algorithms, including optimally modified LSA (OM-LSA),…
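One well-known side effect of downsampling in a wavelet transform is shift variance: delaying the input by one sample can change the coefficients substantially. The sketch below illustrates this with a single-level real Haar detail filter and circular boundaries. It is an illustration of the decimated versus undecimated distinction only, under those simplifying assumptions, and is not the paper's DTCWPT; the function names are hypothetical.

```python
import numpy as np

def haar_detail_undecimated(x):
    """Haar high-pass output at every sample (circular boundary)."""
    return (x - np.roll(x, 1)) / np.sqrt(2.0)

def haar_detail_decimated(x):
    """Same filter, but keep only every second output sample."""
    return haar_detail_undecimated(x)[1::2]

# A rectangular pulse whose edges fall on even sample indices.
x = np.zeros(16)
x[6:12] = 1.0
x_shift = np.roll(x, 1)  # the same pulse delayed by one sample

# Undecimated: the output of the shifted input is the shifted output.
und = haar_detail_undecimated(x)
und_shift = haar_detail_undecimated(x_shift)
shift_equivariant = np.allclose(und_shift, np.roll(und, 1))

# Decimated: the detail-band energy depends on where the edges fall.
energy_dec = float(np.sum(haar_detail_decimated(x) ** 2))
energy_dec_shift = float(np.sum(haar_detail_decimated(x_shift) ** 2))

print(shift_equivariant, energy_dec, energy_dec_shift)
```

In this toy case the decimated detail band misses both pulse edges entirely for one alignment and captures them fully for the other, whereas the undecimated output simply shifts with the input.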
