The Newsbridge - Telecom SudParis VoxCeleb Speaker Recognition Challenge 2022 System Description

Yannis Tevissen (ARMEDIA-SAMOVAR); Jérôme Boudy (ARMEDIA-SAMOVAR); Frédéric Petitpont

arXiv:2301.07491·cs.SD·January 19, 2023

Open Access

TL;DR

This paper presents a novel multi-stream voice activity detection method combined with standard diarization techniques, achieving near state-of-the-art speaker diarization results in the VoxCeleb Challenge 2022.

Contribution

It introduces a multi-stream voice activity detection approach with a classifier entropy-based decision protocol, enhancing diarization performance.

Findings

01

Achieved near state-of-the-art results in speaker diarization.

02

Demonstrated effectiveness of combining multiple VAD algorithms.

03

Showed that strong baseline methods can yield competitive results.

Abstract

We describe the system used by our team for the VoxCeleb Speaker Recognition Challenge 2022 (VoxSRC 2022) in the speaker diarization track. Our solution was designed around a new combination of voice activity detection algorithms that draws on the strengths of several systems. We introduce a novel multi-stream approach with a decision protocol based on classifier entropy. We call this method multi-stream voice activity detection and use it with standard baseline diarization embeddings, clustering and resegmentation. With this work, we demonstrate that, using a strong baseline and working only on voice activity detection, one can achieve close to state-of-the-art results.
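The abstract does not spell out the entropy-based decision protocol, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes each voice activity detection stream emits a per-frame speech probability, and that for every frame the stream with the lowest binary entropy (the most confident one) makes the speech/non-speech call. The names `binary_entropy` and `fuse_vad_streams` and the 0.5 `threshold` are hypothetical and chosen for illustration.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy (in bits) of a speech/non-speech posterior p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fully certain prediction carries no entropy
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def fuse_vad_streams(streams, threshold=0.5):
    """Fuse several VAD streams frame by frame.

    streams: list of equal-length lists of per-frame speech probabilities,
             one inner list per VAD system (assumed input format).
    Returns a list of booleans, True where the frame is labeled as speech.
    """
    if not streams:
        raise ValueError("at least one VAD stream is required")
    n_frames = len(streams[0])
    if any(len(s) != n_frames for s in streams):
        raise ValueError("all streams must cover the same number of frames")

    decisions = []
    for t in range(n_frames):
        frame_probs = [s[t] for s in streams]
        # Assumed protocol: trust the stream whose posterior has the
        # lowest entropy, i.e. the most confident classifier for this frame.
        most_confident = min(frame_probs, key=binary_entropy)
        decisions.append(most_confident >= threshold)
    return decisions


if __name__ == "__main__":
    stream_a = [0.90, 0.55, 0.20]
    stream_b = [0.60, 0.05, 0.45]
    print(fuse_vad_streams([stream_a, stream_b]))
```

In this sketch a stream that hovers near 0.5 is effectively ignored whenever another stream is more decisive, which is one simple way to combine the strengths of several detectors. Other readings of the protocol, such as entropy-weighted averaging of the streams, would be equally consistent with the abstract.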

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Speech and dialogue systems