BUT System Description for DIHARD Speech Diarization Challenge 2019
Federico Landini, Shuai Wang, Mireia Diez, Lukáš Burget, Pavel Matějka, Kateřina Žmolíková, Ladislav Mošner, Oldřich Plchot, Ondřej Novotný, Hossein Zeinali, Johan Rohdin

TL;DR
This paper describes the BUT team's speech diarization systems for the four tracks of the second DIHARD challenge (2019). All systems build on agglomerative hierarchical clustering of x-vectors, with Bayesian HMM refinement added for tracks 1 and 2.
Contribution
It presents track-specific diarization pipelines: AHC over x-vectors refined by a Bayesian HMM with eigenvoice priors for tracks 1 and 2, and AHC over x-vectors extracted on all channels for tracks 3 and 4.
Findings
Effective clustering and HMM integration improved diarization performance.
Systems achieved competitive results in DIHARD 2019 challenge.
Multi-channel x-vector extraction enhanced speaker segmentation.
Abstract
This paper describes the systems developed by the BUT team for the four tracks of the second DIHARD speech diarization challenge. For tracks 1 and 2, the systems perform agglomerative hierarchical clustering (AHC) over x-vectors, then apply a Bayesian Hidden Markov Model (HMM) with eigenvoice priors at the x-vector level, and finally apply the same approach at the frame level. For tracks 3 and 4, the systems perform AHC using x-vectors extracted on all channels.
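As a rough illustration of the AHC step named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes average-linkage clustering with cosine distance as a stand-in for whatever similarity scoring the actual systems use, and it runs on synthetic vectors in place of real x-vectors. The function name `ahc_cluster`, the threshold value, and the data are all hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def ahc_cluster(xvectors, threshold):
    """Group segment embeddings into speakers with average-linkage AHC.

    Cosine distance is an assumption made for this sketch; the source
    does not state which similarity measure drives the clustering.
    """
    # Length-normalize so that only the direction of each embedding matters.
    normed = xvectors / np.linalg.norm(xvectors, axis=1, keepdims=True)
    # Build the merge tree bottom-up from pairwise cosine distances.
    tree = linkage(normed, method="average", metric="cosine")
    # Cut the tree: merges above the distance threshold are not applied.
    return fcluster(tree, t=threshold, criterion="distance")


# Synthetic stand-in for x-vectors: two "speakers", 20 segments each.
rng = np.random.default_rng(0)
centers = np.eye(16)[:2] * 5.0
xvecs = np.vstack([
    centers[0] + 0.1 * rng.standard_normal((20, 16)),
    centers[1] + 0.1 * rng.standard_normal((20, 16)),
])
labels = ahc_cluster(xvecs, threshold=0.5)
```

Here `labels` holds one integer speaker label per segment. In a pipeline like the one described for tracks 1 and 2, such labels would only serve as an initial assignment that a later resegmentation stage refines.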
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Natural Language Processing Techniques
