Deficient Basis Estimation of Noise Spatial Covariance Matrix for Rank-Constrained Spatial Covariance Matrix Estimation Method in Blind Speech Extraction

Yuto Kondo; Yuki Kubo; Norihiro Takamune; Daichi Kitamura; Hiroshi Saruwatari

arXiv:2105.02491 · cs.SD · May 7, 2021

Open Access

TL;DR

This paper extends RCSCME, a blind speech extraction method for one directional target speech mixed with diffuse noise, so that the deficient basis vector of the diffuse noise spatial covariance matrix is estimated as a variable instead of being fixed in advance.

Contribution

The paper proposes an RCSCME extension that estimates the deficient basis of the noise spatial covariance matrix as a vector variable, not only its scale, and derives new EM-based update rules for this model.

Findings

01 Outperforms conventional RCSCME in various noise conditions

02 Accurately estimates deficient basis vectors of noise covariance matrices

03 Demonstrates improved speech extraction quality

Abstract

Rank-constrained spatial covariance matrix estimation (RCSCME) is a state-of-the-art blind speech extraction method applied to cases where one directional target speech and diffuse noise are mixed. In this paper, we propose a new algorithmic extension of RCSCME. RCSCME restores the one deficient rank of the diffuse noise spatial covariance matrix, which cannot be estimated by preprocessing such as independent low-rank matrix analysis, and simultaneously estimates the source model parameters. In the conventional RCSCME, the direction of the deficient basis is fixed in advance and only the scale is estimated; however, the candidate for this deficient basis is generally not unique. In the proposed RCSCME model, the deficient basis itself can be accurately estimated as a vector variable by solving a vector optimization problem. Also, we derive new update rules based on the EM algorithm. We…
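
The rank completion described in the abstract can be sketched as follows. The notation is an assumption for illustration and is not taken from this page: $M$ is the number of microphones, $i$ the frequency bin, $\mathbf{R}'^{(n)}_i$ the rank-$(M-1)$ noise spatial covariance matrix obtained from preprocessing, $\mathbf{b}_i$ the deficient basis, $\lambda_i$ its scale, and $\mathbf{v}_i$ a null vector of $\mathbf{R}'^{(n)}_i$.

```latex
% Sketch under assumed notation; not the paper's exact formulation.
\begin{aligned}
\mathbf{R}^{(n)}_i &= \mathbf{R}'^{(n)}_i + \lambda_i\, \mathbf{b}_i \mathbf{b}_i^{\mathsf{H}},
\qquad \operatorname{rank} \mathbf{R}'^{(n)}_i = M - 1, \quad \lambda_i > 0, \\
\mathbf{R}'^{(n)}_i \mathbf{v}_i &= \mathbf{0}, \quad \mathbf{v}_i \neq \mathbf{0}
\quad\Longrightarrow\quad
\operatorname{rank} \mathbf{R}^{(n)}_i = M
\iff \mathbf{b}_i^{\mathsf{H}} \mathbf{v}_i \neq 0 .
\end{aligned}
```

Under these assumptions, any $\mathbf{b}_i$ that is not orthogonal to $\mathbf{v}_i$ makes the completed matrix full rank, which is one way to see why the candidate for the deficient basis is not unique. In the terms of the abstract, the conventional method fixes $\mathbf{b}_i$ and estimates only $\lambda_i$, whereas the proposed method treats $\mathbf{b}_i$ itself as a variable to be estimated.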

Taxonomy

Topics: Speech and Audio Processing · Blind Source Separation Techniques · Advanced Adaptive Filtering Techniques