TL;DR
This paper introduces a novel method for speech enhancement that estimates power spectral density instantaneously using generalized principal components, leading to improved performance over traditional smooth PSD estimation methods.
Contribution
The paper proposes a new instantaneous PSD estimation technique based on generalized principal components, improving speech enhancement by preserving time variations in the PSD estimates.
Findings
Instantaneous PSD estimates outperform smooth estimates in speech enhancement.
Simulation shows improved speech quality with the proposed method.
The method estimates PSDs from newly defined instantaneous generalized eigenvalues rather than from temporally smooth ones.
Abstract
Power spectral density (PSD) estimates of various microphone signal components are essential to many speech enhancement procedures. As speech is highly non-stationary, performance improvements may be gained by maintaining time-variations in PSD estimates. In this paper, we propose an instantaneous PSD estimation approach based on generalized principal components. Similarly to other eigenspace-based PSD estimation approaches, we rely on recursive averaging in order to obtain a microphone signal correlation matrix estimate to be decomposed. However, instead of estimating the PSDs directly from the temporally smooth generalized eigenvalues of this matrix, yielding temporally smooth PSD estimates, we propose to estimate the PSDs from newly defined instantaneous generalized eigenvalues, yielding instantaneous PSD estimates. The instantaneous generalized eigenvalues are defined from the…
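The smooth baseline that the abstract contrasts against (recursive averaging followed by a generalized eigenvalue decomposition) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the function names `recursive_average` and `generalized_eigh`, the forgetting factor, and the use of a noise coherence matrix `Gamma` as the second matrix of the pencil are all hypothetical choices made for the example. The instantaneous generalized eigenvalues themselves are not sketched, because their definition is cut off in the abstract above.

```python
import numpy as np


def recursive_average(frames, forgetting=0.9):
    """Recursively averaged correlation matrix: R <- a*R + (1 - a)*y y^H.

    `frames` has shape (num_frames, num_mics); `forgetting` (a) is an
    assumed smoothing constant, not a value taken from the paper.
    """
    num_mics = frames.shape[1]
    R = np.zeros((num_mics, num_mics), dtype=complex)
    for y in frames:
        R = forgetting * R + (1.0 - forgetting) * np.outer(y, y.conj())
    return R


def generalized_eigh(A, B):
    """Solve A v = lambda B v for Hermitian A and Hermitian positive-definite B.

    Uses Cholesky whitening (B = L L^H) so that only NumPy is required.
    Returns eigenvalues in descending order and eigenvectors V with V^H B V = I.
    """
    L = np.linalg.cholesky(B)
    L_inv = np.linalg.inv(L)
    C = L_inv @ A @ L_inv.conj().T
    C = 0.5 * (C + C.conj().T)          # remove numerical asymmetry
    w, U = np.linalg.eigh(C)
    V = L_inv.conj().T @ U              # map back to the original coordinates
    order = np.argsort(w)[::-1]
    return w[order], V[:, order]


# Synthetic example (all values are illustrative assumptions):
# one source with steering vector h, plus noise with coherence matrix Gamma.
rng = np.random.default_rng(0)
num_mics, num_frames = 4, 200
h = rng.standard_normal(num_mics) + 1j * rng.standard_normal(num_mics)
Gamma = np.eye(num_mics) + 0.1 * np.ones((num_mics, num_mics))
noise_chol = np.linalg.cholesky(0.5 * Gamma)

source = np.sqrt(2.0) * rng.standard_normal(num_frames)
noise = (rng.standard_normal((num_frames, num_mics))
         + 1j * rng.standard_normal((num_frames, num_mics))) / np.sqrt(2.0)
frames = source[:, None] * h[None, :] + noise @ noise_chol.T

R_y = recursive_average(frames, forgetting=0.95)
smooth_eigenvalues, eigenvectors = generalized_eigh(R_y, Gamma)
print(smooth_eigenvalues)
```

Because `R_y` is a recursive average over many frames, the eigenvalues computed here change slowly over time, which is the temporal smoothness that the proposed approach aims to avoid.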
