Model-Based Speech Enhancement in the Modulation Domain
Yu Wang, Mike Brookes

TL;DR
This paper introduces a modulation-domain speech enhancement algorithm that combines a Kalman filter with a statistical "Gaussring" model. It is evaluated with speech quality, segmental SNR, and intelligibility measures, and reports consistent speech quality improvements over a wide range of SNRs.
Contribution
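It proposes a statistical "Gaussring" model, a mixture of Gaussians whose centers lie on a circle in the complex plane, and jointly models the dynamics of the speech and noise spectral amplitudes to obtain an MMSE estimate of the speech amplitude spectrum.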
Findings
Consistent improvement in speech quality measures over competing algorithms across a wide range of SNRs.
Performs well in speech recognition tasks with different noise types.
Abstract
This paper presents an algorithm for modulation-domain speech enhancement using a Kalman filter. The proposed estimator jointly models the estimated dynamics of the spectral amplitudes of speech and noise to obtain an MMSE estimation of the speech amplitude spectrum with the assumption that the speech and noise are additive in the complex domain. In order to include the dynamics of noise amplitudes with those of speech amplitudes, we propose a statistical "Gaussring" model that comprises a mixture of Gaussians whose centers lie in a circle on the complex plane. The performance of the proposed algorithm is evaluated using the perceptual evaluation of speech quality measure, segmental SNR measure, and short-time objective intelligibility measure. For speech quality measures, the proposed algorithm is shown to give a consistent improvement over a wide range of SNRs when compared to…
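To make the "Gaussring" idea concrete, the sketch below draws samples from a mixture of Gaussians whose centers are spaced evenly on a circle in the complex plane. It is only an illustration of the shape described in the abstract, not the authors' parameterization: the function names, the equal mixture weights, the even spacing of the centers, and the isotropic per-component variance are all assumptions made here.

```python
import numpy as np


def gaussring_centers(radius, num_components):
    """Hypothetical helper: evenly spaced component centers on a circle."""
    angles = 2.0 * np.pi * np.arange(num_components) / num_components
    return radius * np.exp(1j * angles)


def sample_gaussring(radius, sigma, num_components, num_samples, rng):
    """Draw complex samples from an equal-weight mixture of circular Gaussians.

    Assumptions (not taken from the paper): equal weights, evenly spaced
    centers, and the same isotropic variance sigma**2 for every component.
    """
    centers = gaussring_centers(radius, num_components)
    # Pick a mixture component for each sample with equal probability.
    component = rng.integers(num_components, size=num_samples)
    # Circular complex Gaussian perturbation with total variance sigma**2.
    perturbation = sigma * (
        rng.standard_normal(num_samples) + 1j * rng.standard_normal(num_samples)
    ) / np.sqrt(2.0)
    return centers[component] + perturbation


rng = np.random.default_rng(0)
centers = gaussring_centers(radius=1.0, num_components=8)
samples = sample_gaussring(
    radius=1.0, sigma=0.1, num_components=8, num_samples=10_000, rng=rng
)
```

With a small `sigma`, the sample magnitudes cluster near `radius` while the phases are spread around the circle, so the mixture describes a complex quantity whose amplitude is roughly known but whose phase is not.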
