Universal speaker recognition encoders for different speech segments duration
Sergey Novoselov, Vladimir Volokhov, Galina Lavrentyeva

TL;DR
This paper proposes a universal speaker encoder trained to perform reliably across varying speech segment durations, improving speaker verification accuracy without increasing inference time.
Contribution
The paper introduces a simple training recipe for creating universal speaker encoders applicable to any neural network architecture, enhancing robustness across speech durations.
Findings
The universal encoder improves speaker verification across different speech durations.
Evaluation of wav2vec-TDNN based systems on the NIST SRE and VoxCeleb1 benchmarks shows performance gains.
The encoder keeps the same inference time as the underlying neural network.
Abstract
Creating universal speaker encoders that are robust to different acoustic and speech duration conditions is a big challenge today. According to our observations, systems trained on short speech segments are optimal for short-phrase speaker verification, and systems trained on long segments are superior for long-segment verification. A system trained simultaneously on pooled short and long speech segments does not give optimal verification results and usually degrades for both short and long segments. This paper addresses the problem of creating universal speaker encoders for speech segments of different durations. We describe our simple recipe for training a universal speaker encoder for any type of selected neural network architecture. According to our evaluation results of wav2vec-TDNN based systems obtained for NIST SRE and VoxCeleb1 benchmarks the proposed universal encoder provides…
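The abstract as shown here is truncated and does not spell out the training recipe, so the following is not the authors' method. It is only a minimal sketch of the pooled short-and-long baseline the abstract contrasts against, in which each utterance contributes random crops of several durations. The function names, the 16 kHz sample rate, and the 2 s and 6 s durations are illustrative assumptions.

```python
import random


def random_crop(waveform, sample_rate, duration_s, rng):
    """Return a random contiguous crop of duration_s seconds.

    If the waveform is shorter than the requested duration,
    the whole waveform is returned unchanged.
    """
    n = int(duration_s * sample_rate)
    if len(waveform) <= n:
        return list(waveform)
    start = rng.randrange(len(waveform) - n + 1)
    return waveform[start:start + n]


def pooled_crops(waveform, sample_rate, durations_s, rng):
    """One crop per requested duration (hypothetical pooled baseline)."""
    return [random_crop(waveform, sample_rate, d, rng) for d in durations_s]


rng = random.Random(0)
sample_rate = 16000                      # assumed sample rate
utterance = [0.0] * (10 * sample_rate)   # 10 s placeholder signal
# 2 s and 6 s are assumed "short" and "long" durations, not from the paper.
short_seg, long_seg = pooled_crops(utterance, sample_rate, [2.0, 6.0], rng)
```

Training only on `short_seg`-style crops or only on `long_seg`-style crops would correspond to the duration-specific systems the abstract mentions.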
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Advanced Data Compression Techniques
Methods: Test
