Supervised online diarization with sample mean loss for multi-domain data

Enrico Fini; Alessio Brutti

arXiv:1911.01266 · eess.AS · November 14, 2019

TL;DR

This paper modifies the supervised online diarization model UIS-RNN with a new loss function (Sample Mean Loss) and an analytical model of speaker turn behaviour, improving learning efficiency and diarization performance on multi-domain data.

Contribution

The paper proposes Sample Mean Loss and an analytical expression for the probability that a new speaker joins the conversation, enhancing the UIS-RNN framework for multi-domain speaker diarization.
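To make the speaker turn modelling more concrete, here is a minimal sketch of a Chinese-restaurant-process-style prior of the kind the original UIS-RNN uses for speaker assignment after a speaker change: an existing speaker is weighted by how many turns it has already taken, and a new speaker is weighted by a concentration parameter. This is an illustration under that assumption, not the paper's own code. The analytical expression the paper derives is not reproduced in this summary, so `alpha` is simply taken as an input, and the function name and the `"new"` key are hypothetical.

```python
def speaker_change_prior(turn_counts, previous_speaker, alpha):
    """Return a probability for each candidate speaker, given that a change occurred.

    turn_counts: dict mapping speaker id -> number of turns taken so far.
    previous_speaker: id of the speaker who was just talking (excluded).
    alpha: weight of a new, unseen speaker (assumed given, not derived here).
    """
    # Existing speakers other than the previous one are weighted by their turn counts.
    weights = {
        spk: count
        for spk, count in turn_counts.items()
        if spk != previous_speaker
    }
    # The hypothetical key "new" stands for a speaker not seen so far.
    weights["new"] = alpha
    total = sum(weights.values())
    return {spk: w / total for spk, w in weights.items()}


prior = speaker_change_prior({"A": 3, "B": 1}, previous_speaker="A", alpha=1.0)
print(prior)
```

With these toy counts, speaker B and a new speaker each receive half of the probability mass, since B has taken one turn and `alpha` is 1.0.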

Findings

01. Improved diarization performance over the original UIS-RNN.

02. Comparable results to offline clustering baselines.

03. Effective training on fixed-length speech segments.

Abstract

Recently, a fully supervised speaker diarization approach (UIS-RNN) was proposed that models speakers using multiple instances of a parameter-sharing recurrent neural network. In this paper we propose qualitative modifications to the model that significantly improve the learning efficiency and the overall diarization performance. In particular, we introduce a novel loss function, which we call Sample Mean Loss, and we present a better modelling of the speaker turn behaviour by devising an analytical expression to compute the probability of a new speaker joining the conversation. In addition, we demonstrate that our model can be trained on fixed-length speech segments, removing the need for speaker change information in inference. Using x-vectors as input features, we evaluate our proposed approach on the multi-domain dataset employed in the DIHARD II challenge: our online method improves…
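The abstract names Sample Mean Loss without defining it. The sketch below assumes one plausible reading: the model's predicted embedding for a speaker is compared, by squared error, with the mean of a random sample of that speaker's embeddings. The function name, arguments, and sampling scheme are illustrative assumptions, not taken from the paper or its repository.

```python
import random


def sample_mean_loss(predicted, speaker_embeddings, sample_size, rng):
    """Squared error between a prediction and a sample mean (assumed formulation).

    predicted: list of floats, the model's predicted embedding.
    speaker_embeddings: list of embeddings (lists of floats) from one speaker.
    sample_size: how many embeddings to draw without replacement.
    rng: a random.Random instance, passed in for reproducibility.
    """
    # Draw a random subset of the speaker's embeddings.
    sample = rng.sample(speaker_embeddings, sample_size)
    # Average the subset dimension by dimension to form the target.
    dim = len(predicted)
    target = [sum(vec[d] for vec in sample) / sample_size for d in range(dim)]
    # Sum of squared differences between the prediction and the sample mean.
    return sum((p - t) ** 2 for p, t in zip(predicted, target))


rng = random.Random(0)
embeddings = [[0.0, 0.0], [2.0, 2.0]]
print(sample_mean_loss([1.0, 1.0], embeddings, 2, rng))
```

Under this reading, the target is an average over several embeddings rather than a single next observation, which would make it a less noisy regression target. Whether this matches the paper's exact definition should be checked against the full text.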

Code & Models

Repositories

DonkeyShot21/uis-rnn-sml
PyTorch · Official
