Reformulating Speaker Diarization as Community Detection With Emphasis On Topological Structure
Siqi Zheng, Hongbin Suo

TL;DR
This paper redefines speaker diarization as a community detection problem, leveraging topological structure and novel techniques to significantly improve clustering accuracy and reduce diarization error rate.
Contribution
It introduces a community detection framework for speaker diarization, outperforming traditional clustering methods and integrating topological analysis with an end-to-end system.
Findings
Leiden algorithm outperforms previous clustering methods
Dimensionality reduction preserves topological structure
End-to-end system achieves up to 70% DER reduction
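To make the last finding concrete, here is a minimal sketch of how diarization error rate (DER) and a reduction in it are commonly computed. It assumes the standard definition of DER as missed speech plus false alarm plus speaker confusion, divided by total reference speech time, and it assumes the reported reduction is relative to a baseline, which this summary does not state. The function names and all durations are hypothetical, chosen only to make the arithmetic visible; they are not results from the paper.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_speech):
    """DER = (missed speech + false alarm + speaker confusion) / total reference speech."""
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (missed + false_alarm + confusion) / total_speech


def relative_reduction(baseline, improved):
    """Fraction by which `improved` lowers `baseline` (0.5 means the error is halved)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - improved) / baseline


# Hypothetical durations in seconds, NOT taken from the paper.
baseline_der = diarization_error_rate(
    missed=10.0, false_alarm=10.0, confusion=30.0, total_speech=500.0
)
improved_der = diarization_error_rate(
    missed=10.0, false_alarm=10.0, confusion=5.0, total_speech=500.0
)

print(f"baseline DER: {baseline_der:.3f}")
print(f"improved DER: {improved_der:.3f}")
print(f"relative reduction: {relative_reduction(baseline_der, improved_der):.1%}")
```

In this toy case only the speaker confusion term changes, which is the component a better clustering step would be expected to affect most directly.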
Abstract
Clustering-based speaker diarization remains one of the major approaches in practice, despite recent developments in end-to-end diarization. However, clustering methods have not been explored extensively for speaker diarization. Commonly used methods such as k-means, spectral clustering, and agglomerative hierarchical clustering take into account only properties such as proximity and relative density. In this paper we propose to view clustering-based diarization as a community detection problem, so that the topological structure is also considered. This work has four major contributions. First, it is shown that the Leiden community detection algorithm significantly outperforms the previous methods on the clustering of speaker segments. Second, we propose to use uniform manifold approximation to reduce dimension while retaining global and local topological structure. Third, a masked…
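The community detection view in the abstract can be illustrated with a small sketch: speaker-segment embeddings become nodes of a k-nearest-neighbour graph, and a partition of that graph is scored by modularity, one of the quality functions that algorithms such as Leiden can optimise. This is not an implementation of Leiden and not the authors' pipeline; the graph construction, the choice of modularity, the helper names, and the synthetic embeddings are all assumptions made for illustration.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def knn_graph(embeddings, k):
    """Undirected, unweighted k-nearest-neighbour graph as a set of edges (i, j) with i < j."""
    edges = set()
    n = len(embeddings)
    for i in range(n):
        sims = sorted(
            ((cosine(embeddings[i], embeddings[j]), j) for j in range(n) if j != i),
            reverse=True,
        )
        for _, j in sims[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges


def modularity(edges, labels):
    """Newman modularity: sum over communities c of e_c / m - (d_c / (2m))^2."""
    m = len(edges)
    intra = {}        # number of edges with both endpoints in community c
    comm_degree = {}  # total degree of the nodes in community c
    for i, j in edges:
        comm_degree[labels[i]] = comm_degree.get(labels[i], 0) + 1
        comm_degree[labels[j]] = comm_degree.get(labels[j], 0) + 1
        if labels[i] == labels[j]:
            intra[labels[i]] = intra.get(labels[i], 0) + 1
    return sum(intra.get(c, 0) / m - (d / (2 * m)) ** 2 for c, d in comm_degree.items())


# Synthetic stand-ins for speaker-segment embeddings (hypothetical, not real data):
# two speakers, each a noisy cloud around its own centre vector.
rng = random.Random(0)


def fake_segments(centre, count, noise=0.05):
    return [[c + rng.gauss(0.0, noise) for c in centre] for _ in range(count)]


embeddings = fake_segments([1.0, 0.0, 0.0, 0.0], 10) + fake_segments([0.0, 1.0, 0.0, 0.0], 10)
true_labels = [0] * 10 + [1] * 10           # partition that follows the speakers
mixed_labels = [i % 2 for i in range(20)]   # partition that ignores the speakers

edges = knn_graph(embeddings, k=3)
q_true = modularity(edges, true_labels)
q_mixed = modularity(edges, mixed_labels)
print(f"modularity of speaker partition: {q_true:.3f}")
print(f"modularity of mixed partition:   {q_mixed:.3f}")
```

The partition that follows the two synthetic speakers scores a higher modularity than the one that mixes them, which is the signal a community detection algorithm searches for when the number of speakers is not known in advance.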
Taxonomy
Topics: Speech Recognition and Synthesis · Advanced Clustering Algorithms Research
