Joint Bayesian Gaussian discriminant analysis for speaker verification

Yiyan Wang; Haotian Xu; Zhijian Ou

arXiv:1612.04056 · cs.SD · January 20, 2017

Open Access

TL;DR

This paper applies the joint Bayesian (JB) method to speaker verification. Using EM iterations with exact statistics and simultaneous diagonalization for efficient testing, it reports a 9-13% EER reduction over PLDA.

Contribution

It introduces EM iterations with exact statistics, simultaneous diagonalization of the within-class and between-class covariance matrices for efficient testing, and a comparative analysis of Gaussian PLDA variants and JB in speaker verification.

Findings

01 JB achieves 9-13% EER reduction over PLDA.

02 Faster convergence rate with JB.

03 Enhanced understanding of Gaussian PLDA and JB differences.

Abstract

State-of-the-art i-vector based speaker verification relies on variants of Probabilistic Linear Discriminant Analysis (PLDA) for discriminant analysis. We are mainly motivated by recent work on the joint Bayesian (JB) method, which was originally proposed for discriminant analysis in face verification. We apply JB to speaker verification and make three contributions beyond the original JB. 1) In contrast to the EM iterations with approximated statistics in the original JB, EM iterations with exact statistics are employed and give better performance. 2) We propose simultaneous diagonalization (SD) of the within-class and between-class covariance matrices to achieve efficient testing, which has a broader application scope than the SVD-based efficient testing method in the original JB. 3) We scrutinize similarities and differences between various Gaussian PLDAs and JB,…
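
As a rough illustration of the simultaneous diagonalization (SD) idea in contribution 2, the Python sketch below finds a single transform that whitens a within-class covariance matrix while diagonalizing a between-class covariance matrix. It is a generic textbook construction (Cholesky whitening followed by a symmetric eigendecomposition) and assumes the within-class matrix is symmetric positive definite. The function name, variable names, and random example matrices are hypothetical and are not taken from the paper, whose actual procedure may differ.

```python
import numpy as np


def simultaneous_diagonalization(s_w, s_b):
    """Return (w, d) such that w.T @ s_w @ w = I and w.T @ s_b @ w = diag(d).

    Assumes s_w is symmetric positive definite and s_b is symmetric.
    """
    # Factor the within-class covariance: s_w = chol @ chol.T (chol is lower triangular).
    chol = np.linalg.cholesky(s_w)

    # Whitened between-class covariance: chol^{-1} @ s_b @ chol^{-T}.
    tmp = np.linalg.solve(chol, s_b)      # chol^{-1} @ s_b
    m = np.linalg.solve(chol, tmp.T)      # chol^{-1} @ s_b @ chol^{-T}, using s_b symmetric
    m = 0.5 * (m + m.T)                   # remove round-off asymmetry

    # Orthogonal eigenvectors of the whitened matrix diagonalize it.
    d, u = np.linalg.eigh(m)

    # Combined transform: w = chol^{-T} @ u.
    w = np.linalg.solve(chol.T, u)
    return w, d


if __name__ == "__main__":
    # Illustrative random covariances only; not data from the paper.
    rng = np.random.default_rng(0)
    a = rng.standard_normal((5, 5))
    s_w = a @ a.T + 5.0 * np.eye(5)       # symmetric positive definite
    b = rng.standard_normal((5, 2))
    s_b = b @ b.T                         # symmetric, low rank

    w, d = simultaneous_diagonalization(s_w, s_b)
    print("max |w.T s_w w - I|      :", np.abs(w.T @ s_w @ w - np.eye(5)).max())
    print("max |w.T s_b w - diag(d)|:", np.abs(w.T @ s_b @ w - np.diag(d)).max())
```

After this transform both covariance matrices are diagonal, so quadratic forms that involve them reduce to elementwise operations on the transformed vectors, which is one plausible reason the approach makes testing cheaper.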

Taxonomy

Topics: Speech and Audio Processing · Face and Expression Recognition · Bayesian Methods and Mixture Models