TL;DR
This paper introduces a fast variational Bayes algorithm for generative training of heavy-tailed PLDA, a speaker recognition backend for i-vectors and x-vectors that is applied without length normalization.
Contribution
It presents a fast variational Bayes algorithm for generative training of heavy-tailed PLDA, extending the authors' earlier fast scoring algorithm for a discriminatively trained heavy-tailed PLDA backend.
Findings
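Heavy-tailed PLDA, applied without length normalization, gives accuracy similar to Gaussian PLDA with length normalization.
The new variational Bayes algorithm makes generative training of heavy-tailed PLDA fast; previously the model carried considerable extra computational cost.
Old and new backends are compared, with and without length normalization, on i-vectors and x-vectors, using SRE'10, SRE'16 and SITW.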
Abstract
The standard state-of-the-art backend for text-independent speaker recognizers that use i-vectors or x-vectors is Gaussian PLDA (G-PLDA), assisted by a Gaussianization step involving length normalization. G-PLDA can be trained with either generative or discriminative methods. It has long been known that heavy-tailed PLDA (HT-PLDA), applied without length normalization, gives similar accuracy, but at considerable extra computational cost. We have recently introduced a fast scoring algorithm for a discriminatively trained HT-PLDA backend. This paper extends that work by introducing a fast, variational Bayes, generative training algorithm. We compare old and new backends, with and without length normalization, with i-vectors and x-vectors, on SRE'10, SRE'16 and SITW.
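For readers unfamiliar with the length normalization step mentioned in the abstract, the following is a minimal sketch, not the authors' code. It assumes length normalization means scaling each embedding to unit Euclidean norm; in practice it is typically preceded by centering and whitening, which are omitted here. The function name `length_normalize` and the toy vectors are hypothetical.

```python
import math


def length_normalize(vectors, eps=1e-12):
    """Scale each vector to unit Euclidean norm (hypothetical helper).

    `eps` guards against division by zero for an all-zero vector.
    """
    normalized = []
    for v in vectors:
        norm = math.sqrt(sum(x * x for x in v))
        scale = 1.0 / max(norm, eps)
        normalized.append([x * scale for x in v])
    return normalized


# Toy 2-dimensional "embeddings"; real i-vectors and x-vectors have
# hundreds of dimensions.
embeddings = [[3.0, 4.0], [0.5, 0.0], [-6.0, 8.0]]
unit = length_normalize(embeddings)
norms = [math.sqrt(sum(x * x for x in v)) for v in unit]
```

After this step every vector lies on the unit sphere, so the magnitude of the embedding is discarded. HT-PLDA, as described in the abstract, is applied to the embeddings without this step.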
