Community Detection Graph Convolutional Network for Overlap-Aware Speaker Diarization
Jie Wang, Zhicong Chen, Haodong Zhou, Lin Li, Qingyang Hong

TL;DR
This paper introduces CDGCN, a graph-based clustering method that enhances speaker diarization by capturing local and global speaker relationships, especially for overlapping speech segments.
Contribution
The paper presents a novel graph convolutional network approach for overlap-aware speaker diarization, integrating community detection and overlapped speech detection.
Findings
Outperforms traditional clustering methods on the DIHARD III corpus
Effectively detects overlapping speech segments
Improves overall diarization accuracy
Abstract
The clustering algorithm plays a crucial role in speaker diarization systems. However, traditional clustering algorithms struggle with the complex distribution of speaker embeddings and fail to exploit potential relationships between speakers in a session. We propose a novel graph-based clustering approach called Community Detection Graph Convolutional Network (CDGCN) to improve the performance of the speaker diarization system. The CDGCN-based clustering method consists of graph generation, sub-graph detection, and Graph-based Overlapped Speech Detection (Graph-OSD). Firstly, the graph generation refines the local linkages among speech segments. Secondly, the sub-graph detection finds the optimal global partition of the speaker graph. Finally, we view speaker clustering for overlap-aware speaker diarization as an overlapped community detection task and design a Graph-OSD component to…
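As a rough illustration of the first two stages named in the abstract, the Python sketch below builds a cosine-similarity k-nearest-neighbor graph over segment embeddings and then partitions it. Both steps are stand-ins chosen for brevity, not the paper's method: CDGCN refines linkages with a graph convolutional network and applies community detection, whereas this sketch uses a fixed k-NN rule and plain connected components. The function names, the value of k, and the toy embeddings are all assumptions made for the example.

```python
import numpy as np


def knn_graph(embeddings, k=2):
    """Build a symmetric k-NN adjacency matrix from cosine similarity.

    Stand-in for "graph generation"; the paper refines links with a GCN.
    """
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)  # a segment is never its own neighbor
    n = sim.shape[0]
    nearest = np.argsort(-sim, axis=1)[:, :k]  # k most similar per row
    adj = np.zeros((n, n), dtype=bool)
    adj[np.repeat(np.arange(n), k), nearest.ravel()] = True
    return adj | adj.T  # keep an edge if either endpoint selected it


def connected_components(adj):
    """Label each node with its connected component (depth-first search).

    Stand-in for "sub-graph detection"; the paper uses community detection.
    """
    n = adj.shape[0]
    labels = np.full(n, -1, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbor in np.flatnonzero(adj[node]):
                if labels[neighbor] == -1:
                    labels[neighbor] = current
                    stack.append(neighbor)
        current += 1
    return labels


# Toy data (assumed): two synthetic "speakers", five segments each,
# clustered around orthogonal directions with small noise.
rng = np.random.default_rng(0)
speaker_a = np.array([1.0, 0.0]) + 0.05 * rng.standard_normal((5, 2))
speaker_b = np.array([0.0, 1.0]) + 0.05 * rng.standard_normal((5, 2))
embeddings = np.vstack([speaker_a, speaker_b])

labels = connected_components(knn_graph(embeddings, k=2))
print(labels)
```

On this toy data the two synthetic speakers fall into separate components. Real speaker embeddings are far less cleanly separated, which is the limitation of traditional clustering that the abstract cites as motivation for a learned graph approach.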
