DOVER-Lap: A Method for Combining Overlap-aware Diarization Outputs

Desh Raj; Leibny Paola Garcia-Perera; Zili Huang; Shinji Watanabe; Daniel Povey; Andreas Stolcke; Sanjeev Khudanpur

arXiv:2011.01997 · eess.AS · November 5, 2020

TL;DR

DOVER-Lap is an ensemble method that combines the outputs of overlap-aware speaker diarization systems by majority voting. Combining diverse systems improves accuracy, and the same method can be used for late fusion in multichannel scenarios.

Contribution

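It introduces an algorithm for combining diarization outputs that contain overlapping speech, extending the DOVER framework. In place of DOVER's pair-wise incremental label mapping, it maps speaker labels across systems with an approximation algorithm for weighted k-partite graph matching that uses a global cost tensor.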
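The label-mapping idea can be illustrated with a small sketch. This is not the paper's algorithm: it assumes overlap duration between two speakers' segments as the pairwise similarity, scores every tuple of one speaker per hypothesis (a stand-in for a global cost tensor) by brute force, and greedily picks the best-scoring tuple each round. The function names `speaker_overlap` and `greedy_label_mapping` and the `spkN` labels are hypothetical, and the enumeration is exponential in the number of hypotheses, so it only suits toy inputs.

```python
from itertools import combinations, product


def speaker_overlap(segs_a, segs_b):
    """Total time (seconds) during which both speakers are marked active."""
    return sum(
        max(0.0, min(ea, eb) - max(sa, sb))
        for sa, ea in segs_a
        for sb, eb in segs_b
    )


def greedy_label_mapping(hypotheses):
    """Map each hypothesis's local speaker labels to shared global labels.

    hypotheses: list of dicts, local speaker label -> list of (start, end).
    Returns a list of dicts, local speaker label -> global label.
    Assumption: similarity is summed pairwise overlap, maximized greedily.
    """
    remaining = [set(h) for h in hypotheses]
    mapping = [{} for _ in hypotheses]
    next_label = 0
    while any(remaining):
        # Only hypotheses that still have unmapped speakers take part.
        live = [k for k, r in enumerate(remaining) if r]
        best_tuple, best_score = None, -1.0
        # Enumerate every tuple with one speaker per live hypothesis.
        for combo in product(*(sorted(remaining[k]) for k in live)):
            score = sum(
                speaker_overlap(
                    hypotheses[live[i]][combo[i]],
                    hypotheses[live[j]][combo[j]],
                )
                for i, j in combinations(range(len(live)), 2)
            )
            if score > best_score:
                best_tuple, best_score = combo, score
        # Give the winning tuple one shared label and remove its speakers.
        for k, spk in zip(live, best_tuple):
            mapping[k][spk] = f"spk{next_label}"
            remaining[k].remove(spk)
        next_label += 1
    return mapping
```

Scoring whole tuples at once, rather than aligning hypotheses one pair at a time, is what makes the mapping independent of the order in which the systems are listed.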

Findings

01. DOVER-Lap outperforms the best single system on AMI and LibriCSS datasets.

02. It effectively combines diverse diarization systems: clustering-based, region proposal network (RPN), and target-speaker voice activity detection (TS-VAD).

03. The method is also effective for late fusion in multichannel diarization.

Abstract

Several advances have been made recently towards handling overlapping speech for speaker diarization. Since speech and natural language tasks often benefit from ensemble techniques, we propose an algorithm for combining outputs from such diarization systems through majority voting. Our method, DOVER-Lap, is inspired from the recently proposed DOVER algorithm, but is designed to handle overlapping segments in diarization outputs. We also modify the pair-wise incremental label mapping strategy used in DOVER, and propose an approximation algorithm based on weighted k-partite graph matching, which performs this mapping using a global cost tensor. We demonstrate the strength of our method by combining outputs from diverse systems -- clustering-based, region proposal networks, and target-speaker voice activity detection -- on AMI and LibriCSS datasets, where it consistently outperforms the…
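As a rough illustration of how majority voting can be made overlap-aware, the sketch below votes region by region once labels are already in a common space. The abstract excerpt does not spell out the voting rule, so two parts are assumptions: the number of speakers kept in a region is the weighted mean of the per-system speaker counts, rounded half up, and speakers are ranked by weighted votes with ties broken alphabetically. The function name `overlap_aware_vote` is hypothetical.

```python
from collections import defaultdict


def overlap_aware_vote(hypotheses, weights=None):
    """Combine label-mapped diarization hypotheses region by region.

    hypotheses: list of lists of (start, end, speaker) tuples whose speaker
        labels already share one label space.
    weights: optional per-hypothesis weights; uniform if omitted.
    Returns a list of (start, end, speaker); overlapping speech appears as
    several tuples covering the same region.
    """
    if weights is None:
        weights = [1.0 / len(hypotheses)] * len(hypotheses)
    # Every segment boundary from any hypothesis splits the timeline.
    bounds = sorted({t for hyp in hypotheses for s, e, _ in hyp for t in (s, e)})
    output = []
    for start, end in zip(bounds, bounds[1:]):
        votes = defaultdict(float)
        expected_count = 0.0
        for hyp, w in zip(hypotheses, weights):
            # Speakers this hypothesis marks active over the whole region.
            active = {spk for s, e, spk in hyp if s <= start and end <= e}
            expected_count += w * len(active)
            for spk in active:
                votes[spk] += w
        # Assumed rule: keep round(weighted mean speaker count) speakers.
        n_speakers = int(expected_count + 0.5)
        ranked = sorted(votes, key=lambda spk: (-votes[spk], spk))
        for spk in ranked[:n_speakers]:
            output.append((start, end, spk))
    return output
```

With two systems that report speakers A and B overlapping and a third that reports only A, the region where both are active keeps both speakers, because the weighted speaker count rounds to two.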

Code & Models

Repositories

desh2608/dover-lap (Official)