Multiobjective Optimization Training of PLDA for Speaker Verification

Liang He; Xianhong Chen; Can Xu; Jia Liu

arXiv:1808.08344 · cs.SD · November 13, 2018

TL;DR

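This paper introduces a multi-objective optimization approach to training PLDA backends for speaker verification. By accounting for the distinction between speakers in addition to the log-likelihood, it reports more than 10% relative improvement in EER and MinDCF on the NIST SRE14 i-vector challenge dataset and about 20% relative improvement in EER on the MCE18 dataset.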

Contribution

The paper proposes a multi-objective training method for PLDA that balances log-likelihood maximization against discrimination between speakers, leading to better verification accuracy.

Findings

01. Over 10% relative improvement in EER and MinDCF on the NIST SRE14 i-vector challenge dataset

02. Approximately 20% relative improvement in EER on the MCE18 dataset

03. Enhanced speaker verification performance through multi-objective optimization

Abstract

Most current state-of-the-art text-independent speaker verification systems use probabilistic linear discriminant analysis (PLDA) as their backend classifier. The parameters of PLDA are often estimated by maximizing an objective function that increases the value of the log-likelihood function but ignores the distinction between speakers. To better distinguish speakers, we propose a multi-objective optimization training method for PLDA. Experimental results show that the proposed method achieves a relative improvement of more than 10% in both EER and MinDCF on the NIST SRE14 i-vector challenge dataset, and of about 20% in EER on the MCE18 dataset.
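The abstract reports its gains as relative improvements in EER. As a minimal sketch of what that arithmetic means, the Python below computes an equal error rate from trial scores with a simple threshold sweep and then the relative reduction between two error values. The function names `equal_error_rate` and `relative_improvement` are hypothetical helpers introduced here for illustration, and the scores are toy numbers. None of this is the authors' evaluation code or data.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER with a threshold sweep over all observed scores.

    Assumption: a trial is accepted when its score is >= the threshold.
    The EER is taken at the threshold where the false-rejection rate and
    the false-acceptance rate are closest, as the mean of the two.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.unique(np.concatenate([target, nontarget]))
    # False rejection: a target (same-speaker) trial scored below threshold.
    frr = np.array([(target < t).mean() for t in thresholds])
    # False acceptance: a non-target trial scored at or above threshold.
    far = np.array([(nontarget >= t).mean() for t in thresholds])
    i = int(np.argmin(np.abs(frr - far)))
    return (frr[i] + far[i]) / 2.0


def relative_improvement(baseline, proposed):
    """Relative reduction of an error metric where lower is better."""
    return (baseline - proposed) / baseline


# Toy scores for illustration only (not taken from the paper).
toy_target = [2.0, 1.5, 0.2, 3.0]
toy_nontarget = [-1.0, 0.5, 0.0, -0.5]
toy_eer = equal_error_rate(toy_target, toy_nontarget)

# Hypothetical error rates: a drop from 5.0 to 4.0 is a 20% relative gain.
toy_gain = relative_improvement(5.0, 4.0)
```

Under this definition, a "20% relative improvement in EER" means the proposed system's EER is 0.8 times the baseline's EER, not that the EER dropped by 20 percentage points.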

Taxonomy

Topics: Speech Recognition and Synthesis · Music and Audio Processing · Speech and Audio Processing