Speech Enhancement via Two-Stage Dual Tree Complex Wavelet Packet Transform with a Speech Presence Probability Estimator

Pengfei Sun; Jun Qin

arXiv:1610.00644·cs.SD·April 5, 2017

TL;DR

This paper introduces a two-stage speech enhancement method based on the dual tree complex wavelet packet transform (DTCWPT). It combines a speech presence probability (SPP) estimator with a generalized MMSE estimator to reduce noise and improve speech quality at low SNR.

Contribution

It proposes a two-stage DTCWPT framework with an SPP estimator derived from a generalized Gamma speech model and a Gaussian noise model, and reports better speech quality than the four algorithms it is compared against.

Findings

01

Outperforms four state-of-the-art algorithms in PESQ scores.

02

Achieves higher segmental SNR at low SNR levels.

03

Effectively reduces nonstationary noise in speech signals.
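
The segmental SNR mentioned in the findings can be illustrated with a short sketch. This is a common textbook formulation, not necessarily the exact variant used in the paper: the frame length, the clamping range of -10 to 35 dB, and the function name `segmental_snr` are all assumptions chosen for illustration.

```python
import numpy as np

def segmental_snr(clean, enhanced, frame_len=256, min_db=-10.0, max_db=35.0):
    """Mean per-frame SNR in dB between a clean reference and an enhanced signal.

    frame_len, min_db, and max_db are illustrative defaults (assumptions),
    not values taken from the paper.
    """
    clean = np.asarray(clean, dtype=float)
    enhanced = np.asarray(enhanced, dtype=float)
    n_frames = min(len(clean), len(enhanced)) // frame_len
    if n_frames == 0:
        raise ValueError("signals are shorter than one frame")

    eps = 1e-12  # guards against division by zero and log(0)
    frame_snrs = []
    for i in range(n_frames):
        start, stop = i * frame_len, (i + 1) * frame_len
        s = clean[start:stop]
        err = s - enhanced[start:stop]
        snr_db = 10.0 * np.log10((np.sum(s ** 2) + eps) / (np.sum(err ** 2) + eps))
        # Clamp each frame so silent or near-perfect frames do not dominate the mean.
        frame_snrs.append(np.clip(snr_db, min_db, max_db))
    return float(np.mean(frame_snrs))
```

Averaging in the dB domain over short frames weights quiet and loud segments more evenly than a single global SNR, which is why the measure is often reported alongside PESQ.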

Abstract

In this paper, a two-stage dual tree complex wavelet packet transform (DTCWPT) based speech enhancement algorithm is proposed, in which a speech presence probability (SPP) estimator and a generalized minimum mean squared error (MMSE) estimator are developed. To overcome the signal distortion caused by downsampling in the WPT, a two-stage analytic decomposition that concatenates an undecimated WPT (UWPT) and a decimated WPT is employed. An SPP estimator in the DTCWPT domain is derived from a generalized Gamma distribution for speech and a Gaussian assumption for noise. The validation results show that the proposed algorithm achieves higher perceptual evaluation of speech quality (PESQ) scores and segmental signal-to-noise ratio (SegSNR) under low-SNR nonstationary noise than four other state-of-the-art speech enhancement algorithms, including optimally modified LSA (OM-LSA),…
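
To make the idea of an SPP estimator concrete, the sketch below computes a speech presence probability for a set of transform coefficients. It is a simplification and not the paper's estimator: it assumes a complex Gaussian model for both speech and noise (the paper uses a generalized Gamma model for speech) and a fixed a priori SNR. The names `spp_gaussian` and `spp_weighted_wiener_gain` and the parameters `xi_db` and `prior_h1` are hypothetical.

```python
import numpy as np

def spp_gaussian(coeff_power, noise_var, xi_db=15.0, prior_h1=0.5):
    """Posterior probability of speech presence per coefficient.

    Assumes complex Gaussian speech and noise (a simplification; the paper
    uses a generalized Gamma speech model). xi_db is an assumed fixed
    a priori SNR for the speech-present hypothesis.
    """
    coeff_power = np.asarray(coeff_power, dtype=float)
    xi = 10.0 ** (xi_db / 10.0)
    post_snr = coeff_power / noise_var          # a posteriori SNR
    prior_ratio = (1.0 - prior_h1) / prior_h1   # P(H0) / P(H1)
    # Bayes' rule applied to the two Gaussian likelihoods.
    return 1.0 / (1.0 + prior_ratio * (1.0 + xi) * np.exp(-post_snr * xi / (1.0 + xi)))

def spp_weighted_wiener_gain(coeff_power, noise_var, xi_db=15.0, prior_h1=0.5):
    """Wiener gain (the MMSE gain under the Gaussian model) scaled by the SPP."""
    xi = 10.0 ** (xi_db / 10.0)
    return spp_gaussian(coeff_power, noise_var, xi_db, prior_h1) * xi / (1.0 + xi)

# Example: coefficients near the noise floor get a low probability,
# strong coefficients get a probability close to 1.
powers = np.array([0.0, 1.0, 5.0, 20.0])
probs = spp_gaussian(powers, noise_var=1.0)
```

Multiplying an MMSE gain by the SPP suppresses coefficients that are probably noise-only while leaving likely speech coefficients nearly untouched, which is the general role an SPP estimator plays alongside an MMSE estimator.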
