Sub-vector Extraction and Cascade Post-Processing for Speaker Verification Using MLLR Super-vectors
A. K. Sarkar, C. Barras, V. B. Le, D. Matrouf

TL;DR
This paper introduces a speaker verification system using MLLR super-vectors and m-vectors, employing cascade post-processing and LDA for improved accuracy, validated on NIST SRE datasets.
Contribution
It proposes a novel cascade post-processing approach combining LDA and PLDA for MLLR super-vectors, enhancing speaker verification performance.
Findings
Significant performance improvement over conventional MLLR super-vector systems.
Cascade post-processing reduces error rates across datasets.
Fusion with i-vector systems demonstrates complementary benefits.
Abstract
In this paper, we propose a speaker-verification system based on maximum likelihood linear regression (MLLR) super-vectors, for which speakers are characterized by m-vectors. These vectors are obtained by a uniform segmentation of the speaker MLLR super-vector using an overlapped sliding window. We consider three approaches for MLLR transformation, based on the conventional 1-best automatic transcription, on the lattice word transcription, or on a simple global universal background model (UBM). Session variability compensation is performed in a post-processing module with probabilistic linear discriminant analysis (PLDA) or the eigen factor radial (EFR). Alternatively, we propose a cascade post-processing for the MLLR super-vector-based speaker-verification system. In this case, the m-vectors or MLLR super-vectors are first projected onto a lower-dimensional vector space generated…
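The m-vector extraction described in the abstract (uniform segmentation of the super-vector with an overlapped sliding window) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `extract_m_vectors`, the parameters `window` and `shift`, and the choice to drop a trailing remainder shorter than the window are all assumptions, and the toy values do not reflect the dimensions used in the paper.

```python
from typing import List, Sequence


def extract_m_vectors(
    supervector: Sequence[float], window: int, shift: int
) -> List[List[float]]:
    """Split a super-vector into equal-length, overlapping sub-vectors.

    `window` is the sub-vector length and `shift` the step between window
    starts (both hypothetical names); shift < window yields overlap.
    Assumption of this sketch: a trailing remainder shorter than `window`
    is dropped rather than padded.
    """
    if window <= 0 or shift <= 0:
        raise ValueError("window and shift must be positive")
    if window > len(supervector):
        raise ValueError("window is longer than the super-vector")
    last_start = len(supervector) - window
    return [
        list(supervector[start:start + window])
        for start in range(0, last_start + 1, shift)
    ]


# Toy example: a 10-dimensional stand-in for an MLLR super-vector,
# window of 4 with a shift of 2 (50% overlap between neighbours).
toy = [float(i) for i in range(10)]
m_vectors = extract_m_vectors(toy, window=4, shift=2)
```

Each resulting sub-vector would then be treated as a separate speaker representation and passed to the post-processing stage; how the paper handles the final partial window, if any, is not stated in the text above.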
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
