Supervector Compression Strategies to Speed up I-Vector System Development
Ville Vestman, Tomi Kinnunen

TL;DR
This paper compares four supervector compression methods (PPCA, FA, SPPCA, and PPLS) against the standard FEFA approach to i-vector extraction in speaker verification, finding comparable accuracy and large speedups in model training.
Contribution
It introduces and evaluates alternative supervector compression techniques, highlighting their efficiency and performance relative to FEFA in speaker verification systems.
Findings
Supervector compression methods achieve similar accuracy to FEFA.
Supervised approaches did not improve performance.
PPCA and FA provide over 100x speedup in model training relative to FEFA.
Abstract
The front-end factor analysis (FEFA), an extension of probabilistic principal component analysis (PPCA) tailored to be used with Gaussian mixture models (GMMs), is currently the prevalent approach to extract compact utterance-level features (i-vectors) for automatic speaker verification (ASV) systems. Little research has been conducted comparing FEFA to the conventional PPCA applied to maximum a posteriori (MAP) adapted GMM supervectors. We study several alternative methods, including PPCA, factor analysis (FA), and two supervised approaches, supervised PPCA (SPPCA) and the recently proposed probabilistic partial least squares (PPLS), to compress MAP-adapted GMM supervectors. The resulting i-vectors are used in ASV tasks with a probabilistic linear discriminant analysis (PLDA) back-end. We experiment on two different datasets: the telephone condition of NIST SRE 2010 and the recent VoxCeleb…
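To make the idea of supervector compression concrete, here is a minimal sketch of the PPCA variant, assuming the textbook closed-form maximum-likelihood solution of Tipping and Bishop rather than the authors' exact training recipe. The function names `fit_ppca` and `extract_ivectors` are hypothetical, and random data stands in for MAP-adapted GMM supervectors; MAP adaptation, any normalization steps, and the PLDA back-end are not shown.

```python
import numpy as np


def fit_ppca(X, q):
    """Closed-form maximum-likelihood PPCA on the rows of X (n x d), with q < d."""
    n, d = X.shape
    mu = X.mean(axis=0)
    Xc = X - mu
    # Thin SVD of the centered data avoids forming the d x d covariance matrix,
    # which matters because supervectors are very high-dimensional.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    eigvals = s ** 2 / n  # nonzero eigenvalues of the sample covariance
    # Noise variance: average of the discarded eigenvalues over d - q dimensions
    # (eigenvalues beyond the rank of Xc are zero and add nothing to the sum).
    sigma2 = eigvals[q:].sum() / (d - q)
    # Loading matrix: leading eigenvectors scaled by sqrt(eigenvalue - sigma2).
    W = Vt[:q].T * np.sqrt(np.maximum(eigvals[:q] - sigma2, 0.0))
    return mu, W, sigma2


def extract_ivectors(X, mu, W, sigma2):
    """Posterior mean of the latent variable: M^{-1} W^T (x - mu), M = W^T W + sigma2 I."""
    q = W.shape[1]
    M = W.T @ W + sigma2 * np.eye(q)
    return np.linalg.solve(M, W.T @ (X - mu).T).T


# Toy stand-in for supervectors: 200 "utterances", 50 dimensions (assumed sizes).
rng = np.random.default_rng(0)
supervectors = rng.standard_normal((200, 50))

mu, W, sigma2 = fit_ppca(supervectors, q=10)
ivectors = extract_ivectors(supervectors, mu, W, sigma2)
print(ivectors.shape)
```

Because this solution needs only one SVD of the supervector matrix, it hints at why compressing precomputed supervectors can be much cheaper to train than an iterative model, although the sketch makes no claim about the timings reported in the paper.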
