Overlap-aware diarization: resegmentation using neural end-to-end overlapped speech detection
Latané Bullock, Hervé Bredin, Leibny Paola Garcia-Perera

TL;DR
This paper introduces an overlap-aware diarization system that uses neural LSTM-based overlap detection and resegmentation to improve speaker diarization accuracy, especially in overlapping speech segments.
Contribution
It presents a neural LSTM-based overlap detector and uses its output in the resegmentation step of a diarization system, achieving state-of-the-art overlap detection on the AMI, DIHARD, and ETAPE corpora.
Findings
State-of-the-art overlap detection performance on AMI, DIHARD, ETAPE.
20% relative reduction in diarization error rate (DER) over the baseline system on AMI with overlap-aware resegmentation.
Promising directions for handling overlapping speech in diarization.
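To make the "relative DER reduction" figure concrete, the snippet below is a minimal sketch of how a relative reduction is usually computed from a baseline and an improved error rate. The function name and the numeric values are hypothetical and chosen only for illustration; they are not the DER values reported in the paper.

```python
def relative_reduction(baseline_der: float, new_der: float) -> float:
    """Fraction by which new_der improves on baseline_der (0.2 means 20%)."""
    if baseline_der <= 0:
        raise ValueError("baseline_der must be positive")
    return (baseline_der - new_der) / baseline_der


# Hypothetical values for illustration only, not results from the paper.
example = relative_reduction(baseline_der=30.0, new_der=24.0)
print(f"{example:.0%} relative reduction")
```

A relative reduction is measured against the baseline error, so the same absolute gain counts for more when the baseline DER is already low.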
Abstract
We address the problem of effectively handling overlapping speech in a diarization system. First, we detail a neural Long Short-Term Memory-based architecture for overlap detection. Secondly, detected overlap regions are exploited in conjunction with a frame-level speaker posterior matrix to make two-speaker assignments for overlapped frames in the resegmentation step. The overlap detection module achieves state-of-the-art performance on the AMI, DIHARD, and ETAPE corpora. We apply overlap-aware resegmentation on AMI, resulting in a 20% relative DER reduction over the baseline system. While this approach is by no means an end-all solution to overlap-aware diarization, it reveals promising directions for handling overlap.
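The two-speaker assignment described in the abstract can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation: it assumes the frame-level speaker posterior matrix is a NumPy array of shape (frames, speakers), that the overlap detector output has already been thresholded into a boolean mask, and that every frame is speech. The function and variable names are hypothetical.

```python
import numpy as np


def overlap_aware_assignment(posteriors, overlap_mask):
    """Assign one speaker per frame, or two on frames flagged as overlap.

    posteriors:   (num_frames, num_speakers) frame-level speaker posteriors.
    overlap_mask: (num_frames,) booleans, True where overlap was detected.
    Returns a (num_frames, num_speakers) 0/1 speaker-activity matrix.
    Assumption: all frames are speech (no speech activity detection here).
    """
    posteriors = np.asarray(posteriors, dtype=float)
    overlap_mask = np.asarray(overlap_mask, dtype=bool)
    num_frames, num_speakers = posteriors.shape

    # Rank speakers per frame, most likely first.
    ranked = np.argsort(-posteriors, axis=1, kind="stable")

    activity = np.zeros((num_frames, num_speakers), dtype=int)
    frames = np.arange(num_frames)

    # Every frame gets its most likely speaker.
    activity[frames, ranked[:, 0]] = 1

    # Overlapped frames additionally get the second most likely speaker.
    if num_speakers > 1:
        activity[frames[overlap_mask], ranked[overlap_mask, 1]] = 1
    return activity


# Toy input: 3 frames, 3 speakers, overlap detected on the middle frame.
toy_posteriors = [[0.7, 0.2, 0.1],
                  [0.4, 0.5, 0.1],
                  [0.1, 0.3, 0.6]]
toy_overlap = [False, True, False]
print(overlap_aware_assignment(toy_posteriors, toy_overlap))
```

On the toy input, the middle frame is assigned to the two highest-scoring speakers while the other frames keep a single speaker. A full system would also need to smooth these decisions over time and exclude non-speech frames, which this sketch leaves out.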
