Simultaneous Diarization and Separation of Meetings through the Integration of Statistical Mixture Models

Tobias Cord-Landwehr; Christoph Boeddeker; Reinhold Haeb-Umbach

arXiv:2410.21455 · eess.AS · February 25, 2025 · ICASSP

Open Access

TL;DR

This paper introduces a joint statistical framework that combines a complex Angular Central Gaussian Mixture Model (cACGMM) with a von-Mises-Fisher Mixture Model (VMFMM) for simultaneous diarization and separation of meeting speech. The framework supports block-wise processing and improves word error rates.

Contribution

It presents a novel integrated approach for diarization and separation using statistical mixture models, including a new method for counting active speakers per segment.

Findings

01 Outperforms cascaded methods in WER on the LibriCSS corpus

02 Supports block-online processing with speaker counting

03 Effectively exploits spatial and spectral information

Abstract

We propose an approach for simultaneous diarization and separation of meeting data. It consists of a complex Angular Central Gaussian Mixture Model (cACGMM) for speech source separation, and a von-Mises-Fisher Mixture Model (VMFMM) for diarization in a joint statistical framework. Through the integration, both spatial and spectral information are exploited for diarization and separation. We also develop a method for counting the number of active speakers in a segment of a meeting to support block-wise processing. While the total number of speakers in a meeting may be known, it is usually not known on a per-segment level. With the proposed speaker counting, joint diarization and source separation can be done segment-by-segment, and the permutation problem across segments is solved, thus allowing for block-online processing in the future. Experimental results on the LibriCSS meeting…
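To make the diarization component more concrete, the following is a minimal sketch of expectation-maximization for a von-Mises-Fisher mixture on unit-length feature vectors. It is not the authors' joint model: the function name `vmf_mixture_em`, the fixed shared concentration `kappa`, the farthest-point initialisation, and the assumption that the inputs are speaker embeddings are all illustrative choices, and the coupling with the cACGMM is left out entirely. With a shared, fixed concentration the vMF normaliser is identical for all classes and cancels in the posterior, so only the term `kappa * mu_k^T x_n` remains.

```python
import numpy as np


def vmf_mixture_em(x, num_classes, kappa=20.0, iterations=20):
    """Fit a vMF mixture with a fixed, shared concentration (illustrative sketch).

    x: (N, D) array of feature vectors; rows are normalised to unit length.
    Returns (mu, weights, posteriors) with shapes (K, D), (K,), (N, K).
    """
    x = np.asarray(x, dtype=float)
    x = x / np.linalg.norm(x, axis=1, keepdims=True)

    # Farthest-point initialisation: start with the first observation, then
    # repeatedly add the observation least similar to the chosen ones.
    chosen = [0]
    for _ in range(1, num_classes):
        similarity = (x @ x[chosen].T).max(axis=1)
        chosen.append(int(similarity.argmin()))
    mu = x[chosen].copy()
    weights = np.full(num_classes, 1.0 / num_classes)

    for _ in range(iterations):
        # E-step: log posterior up to a constant, log w_k + kappa * mu_k^T x_n.
        log_post = np.log(weights) + kappa * (x @ mu.T)
        log_post -= log_post.max(axis=1, keepdims=True)  # numerical stability
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)

        # M-step: update class priors and mean directions.
        weights = np.maximum(post.mean(axis=0), 1e-10)
        weights /= weights.sum()
        resultant = post.T @ x  # (K, D) posterior-weighted sums
        norms = np.linalg.norm(resultant, axis=1, keepdims=True)
        mu = resultant / np.maximum(norms, 1e-10)

    return mu, weights, post
```

In a diarization setting, taking `post.argmax(axis=1)` would assign each feature vector to one class. Estimating `kappa` per class, as a full VMFMM would, requires the ratio of modified Bessel functions and is omitted here for brevity.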

Taxonomy

Topics: Bayesian Methods and Mixture Models · Advanced Clustering Algorithms Research