Novel Architectures for Unsupervised Information Bottleneck based Speaker Diarization of Meetings
Nauman Dawalatabad, Srikanth Madikeri, C. Chandra Sekhar, Hema A. Murthy

TL;DR
This paper introduces novel unsupervised speaker diarization architectures that improve segmentation initialization and incorporate speaker discriminative features, resulting in significant performance gains on standard meeting datasets.
Contribution
It proposes a varying length segment initialization technique and a Two-Pass Information Bottleneck framework that together enhance unsupervised speaker diarization performance.
Findings
Achieved absolute improvements of 3.9% and 4.7% over the baseline IB-based system on the NIST and AMI datasets.
Demonstrated the effectiveness of combining segment initialization with discriminative features.
Improved baseline IB-based diarization with novel initialization and two-pass approach.
Abstract
Speaker diarization is an important problem that is topical, and is especially useful as a preprocessor for conversational speech related applications. The objective of this paper is two-fold: (i) segment initialization by uniformly distributing speaker information across the initial segments, and (ii) incorporating speaker discriminative features within the unsupervised diarization framework. In the first part of the work, a varying length segment initialization technique for Information Bottleneck (IB) based speaker diarization system using phoneme rate as the side information is proposed. This initialization distributes speaker information uniformly across the segments and provides a better starting point for IB based clustering. In the second part of the work, we present a Two-Pass Information Bottleneck (TPIB) based speaker diarization system that incorporates speaker…
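One way to picture the varying length segment initialization described in the abstract is the minimal sketch below. It is an illustration under stated assumptions, not the authors' implementation: it assumes phoneme onset times are already available from some phoneme recognizer, and it assumes that "using phoneme rate as the side information" can be approximated by cutting the recording so that every initial segment spans roughly the same number of phonemes. The function name `varying_length_segments` and the parameter `phonemes_per_segment` are hypothetical.

```python
def varying_length_segments(phoneme_onsets, total_duration, phonemes_per_segment=10):
    """Split [0, total_duration] into segments holding about the same phoneme count.

    Assumption (not taken from the paper): a boundary is placed at every
    `phonemes_per_segment`-th phoneme onset, so stretches of fast speech yield
    short segments and stretches of slow speech yield long ones.
    """
    if phonemes_per_segment <= 0:
        raise ValueError("phonemes_per_segment must be positive")

    # Keep only onsets strictly inside the recording, in time order.
    onsets = sorted(t for t in phoneme_onsets if 0.0 < t < total_duration)

    # Every k-th onset (starting from the k-th) becomes a segment boundary.
    cuts = onsets[phonemes_per_segment::phonemes_per_segment]

    # Add the start and end of the recording and pair up adjacent edges.
    edges = [0.0] + cuts + [total_duration]
    return list(zip(edges[:-1], edges[1:]))


# Four phonemes in the first second (fast speech), then one per second (slow speech).
onsets = [0.1, 0.2, 0.3, 0.4, 1.0, 2.0, 3.0, 4.0]
segments = varying_length_segments(onsets, total_duration=5.0, phonemes_per_segment=4)
```

In this toy input both resulting segments contain four phoneme onsets, but the first lasts one second and the second lasts four. A fixed-length initialization would instead give every segment the same duration regardless of how much speech it contains.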
