Under-determined reverberant audio source separation using a full-rank spatial covariance model

Ngoc Duong (INRIA - Irisa); Emmanuel Vincent (INRIA - Irisa); Remi Gribonval (INRIA - Irisa)

arXiv:0912.0171·stat.ML·December 14, 2009·IEEE Trans. Speech Audio Process.·1 cites

TL;DR

This paper introduces a full-rank spatial covariance model for under-determined reverberant audio source separation and derives EM algorithms to estimate the model parameters.

Contribution

It proposes a full-rank unconstrained covariance model and develops EM algorithms for effective source separation in reverberant, under-determined settings.

Findings

01. Effective separation in synthetic reverberant mixtures

02. Successful application to live speech recordings

03. Alignment of the estimated sources across frequency bins based on their estimated directions of arrival (DOA)

Abstract

This article addresses the modeling of reverberant recording environments in the context of under-determined convolutive blind source separation. We model the contribution of each source to all mixture channels in the time-frequency domain as a zero-mean Gaussian random variable whose covariance encodes the spatial characteristics of the source. We then consider four specific covariance models, including a full-rank unconstrained model. We derive a family of iterative expectation-maximization (EM) algorithms to estimate the parameters of each model and propose suitable procedures to initialize the parameters and to align the order of the estimated sources across all frequency bins based on their estimated directions of arrival (DOA). Experimental results over reverberant synthetic mixtures and live recordings of speech data show the effectiveness of the proposed approach.
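To make the Gaussian model in the abstract concrete, here is a minimal sketch of one EM iteration at a single frequency bin. It is an illustration only, not the authors' code. It assumes the covariance of each source contribution factors as a scalar time-varying variance `v[j, n]` times a time-invariant full-rank spatial covariance `R[j]`, a factorization the abstract itself does not spell out. The function name `em_step` and all variable names are hypothetical.

```python
import numpy as np

def em_step(x, v, R):
    """One EM iteration at a single frequency bin (illustrative sketch).

    x: (N, I) complex mixture STFT coefficients (N frames, I channels)
    v: (J, N) positive source variances (assumed parameterization)
    R: (J, I, I) Hermitian positive-definite spatial covariances
    Returns the source contribution estimates and updated v, R.
    """
    I = x.shape[1]
    eye = np.eye(I)

    # E-step: covariance of each source contribution, shape (J, N, I, I),
    # and covariance of the mixture as their sum, shape (N, I, I).
    sigma_c = v[:, :, None, None] * R[:, None, :, :]
    sigma_x = sigma_c.sum(axis=0)

    # Multichannel Wiener filter and posterior mean of each contribution.
    W = sigma_c @ np.linalg.inv(sigma_x)
    c_hat = np.einsum('jnab,nb->jna', W, x)

    # Posterior second-order moment: outer product of the mean
    # plus the posterior covariance (I - W) * sigma_c.
    R_c = np.einsum('jna,jnb->jnab', c_hat, c_hat.conj()) + (eye - W) @ sigma_c

    # M-step: v = tr(R^{-1} R_c) / I, then R = mean over frames of R_c / v.
    R_inv = np.linalg.inv(R)
    v_new = np.einsum('jab,jnba->jn', R_inv, R_c).real / I
    R_new = (R_c / v_new[:, :, None, None]).mean(axis=1)
    return c_hat, v_new, R_new
```

Because the Wiener filters of all sources sum to the identity, the estimated contributions add up exactly to the observed mixture, which holds even when there are more sources than channels. A full separation system would repeat this step over all frequency bins, and would still need the initialization and DOA-based source ordering procedures mentioned in the abstract, which this sketch does not cover.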

Taxonomy

Topics: Speech and Audio Processing · Blind Source Separation Techniques · Advanced Adaptive Filtering Techniques