Structured Sparsity Models for Multiparty Speech Recovery from Reverberant Recordings

Afsaneh Asaei; Mohammad Golbabaee; Hervé Bourlard; Volkan Cevher

arXiv:1210.6766 · cs.LG · October 26, 2012 · 5 cites

TL;DR

This paper introduces a novel structured sparsity approach for recovering and separating multi-party speech in reverberant environments by modeling room acoustics and leveraging sparse and low-rank structures.

Contribution

It proposes a new method for characterizing room acoustics and recovering speech using structured sparsity and convex optimization, advancing multi-party speech processing in reverberant settings.

Findings

01 Effective room modeling from unknown sources
02 Improved speech separation accuracy
03 Robustness demonstrated on real recordings

Abstract

We tackle the multi-party speech recovery problem by modeling the acoustics of reverberant chambers. Our approach exploits structured sparsity models to perform room modeling and speech recovery. We propose a scheme for characterizing the room acoustics from the unknown competing speech sources; it relies on localizing the early images of the speakers by sparse approximation of the spatial spectra of the virtual sources in a free-space model. The images are then clustered by exploiting the low-rank structure of the spectro-temporal components belonging to each source. This enables us to identify the early support of the room impulse response function and its unique map to the room geometry. To further tackle the ambiguity of the reflection ratios, we propose a novel formulation of the reverberation model and estimate the absorption coefficients through a convex optimization…
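
The "early images" and "virtual sources in a free-space model" mentioned in the abstract can be illustrated with the classical image-source construction. The sketch below is not the paper's sparse-recovery method: it assumes a rectangular (shoebox) room with one corner at the origin, which the abstract does not state, and the function names, the speed-of-sound constant, and the example coordinates are all hypothetical. It only shows how each wall mirrors a speaker into a virtual source, and how that virtual source would be heard at a microphone if it radiated in free space.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value, not from the paper


def first_order_images(source, room_dims):
    """Mirror a source across each of the six walls of a shoebox room.

    The room is assumed to occupy [0, Lx] x [0, Ly] x [0, Lz].
    Returns six (x, y, z) tuples, ordered by axis, near wall first.
    """
    images = []
    for axis in range(3):
        for wall in (0.0, room_dims[axis]):
            image = list(source)
            # Reflection of a coordinate across the plane at `wall`.
            image[axis] = 2.0 * wall - source[axis]
            images.append(tuple(image))
    return images


def free_space_delay_and_gain(position, mic):
    """Propagation delay (s) and spherical-spreading gain from a point to a mic."""
    distance = math.dist(position, mic)
    return distance / SPEED_OF_SOUND, 1.0 / (4.0 * math.pi * distance)


# Hypothetical geometry, in metres.
source = (1.0, 2.0, 1.5)
room = (4.0, 5.0, 3.0)
mic = (2.0, 2.0, 1.5)

images = first_order_images(source, room)
early_taps = [free_space_delay_and_gain(image, mic) for image in images]
```

Each entry of `early_taps` is a (delay, gain) pair for one early reflection before wall absorption is applied. Under this simplified model, the delays correspond to the early support of the room impulse response, and the unknown absorption would scale the gains.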

Taxonomy

Topics: Speech and Audio Processing · Advanced Adaptive Filtering Techniques · Blind Source Separation Techniques