Domain-Dependent Speaker Diarization for the Third DIHARD Challenge
A Kishore Kumar, Shefali Waldekar, Goutam Saha, Md Sahidullah

TL;DR
This paper introduces a domain-dependent speaker diarization system that leverages acoustic domain identification to improve performance, achieving significant DER reductions in the DIHARD III challenge.
Contribution
The work presents a simple, efficient acoustic domain-dependent diarization approach using speaker embeddings, with i-vector based ADI outperforming x-vector methods.
Findings
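The i-vector based ADI outperforms the x-vector based approach on the third DIHARD challenge dataset
Optimizing the AHC thresholds and the dimensionality-reduction parameters for each acoustic domain improves diarization accuracy over the baseline
Relative DER improvements of 9.63% and 10.64% for the core and full conditions of DIHARD III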
Abstract
This report presents the system developed by the ABSP Laboratory team for the third DIHARD speech diarization challenge. Our main contribution in this work is to develop a simple and efficient solution for acoustic domain dependent speech diarization. We explore speaker embeddings for the acoustic domain identification (ADI) task. Our study reveals that the i-vector based method achieves considerably better performance than the x-vector based approach on the third DIHARD challenge dataset. Next, we integrate the ADI module with the diarization framework. The performance improved substantially over that of the baseline when we optimized, for each individual acoustic domain, the thresholds for agglomerative hierarchical clustering and the parameters for dimensionality reduction during scoring. We achieved a relative improvement of 9.63% and 10.64% in DER for core and full conditions,…
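The domain-dependent idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a nearest-centroid classifier as a stand-in for the ADI module, cosine similarity as a stand-in for the scoring backend, and average-linkage AHC. The domain names, centroids, thresholds, and embeddings are hypothetical values chosen only for illustration, and the per-domain dimensionality reduction step is omitted.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def identify_domain(recording_embedding, domain_centroids):
    """Stand-in ADI: pick the domain whose centroid is most similar."""
    return max(domain_centroids,
               key=lambda d: cosine(recording_embedding, domain_centroids[d]))

def ahc(segment_embeddings, threshold):
    """Average-linkage AHC that stops once the best pair scores below threshold."""
    clusters = [[i] for i in range(len(segment_embeddings))]

    def linkage(c1, c2):
        sims = [cosine(segment_embeddings[i], segment_embeddings[j])
                for i in c1 for j in c2]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        # Find the most similar pair of clusters.
        best_score, (a, b) = max(
            (linkage(clusters[a], clusters[b]), (a, b))
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if best_score < threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]

    # Convert the cluster membership lists into one speaker label per segment.
    labels = [0] * len(segment_embeddings)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels

def diarize(recording_embedding, segment_embeddings,
            domain_centroids, domain_thresholds):
    """Identify the domain, then cluster with that domain's own threshold."""
    domain = identify_domain(recording_embedding, domain_centroids)
    return domain, ahc(segment_embeddings, domain_thresholds[domain])

# Hypothetical toy data: two domains with different tuned stopping thresholds.
domain_centroids = {"domain_a": [1.0, 0.0], "domain_b": [0.0, 1.0]}
domain_thresholds = {"domain_a": 0.5, "domain_b": 0.2}
segments = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]

print(diarize([0.9, 0.1], segments, domain_centroids, domain_thresholds))
print(diarize([0.1, 0.9], segments, domain_centroids, domain_thresholds))
```

In this toy setup the same three segments are clustered differently depending on which domain the recording is assigned to, which is the effect that per-domain threshold tuning is meant to exploit.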
Taxonomy
Topics: Speech Recognition and Synthesis · Music and Audio Processing · Speech and Audio Processing
