Multiobjective Optimization Training of PLDA for Speaker Verification
Liang He, Xianhong Chen, Can Xu, Jia Liu

TL;DR
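This paper introduces a multi-objective optimization approach for training PLDA in speaker verification that accounts for the distinction between speakers as well as the log-likelihood, yielding more than 10% relative improvement in EER and MinDCF on NIST SRE14 and about 20% relative improvement in EER on MCE18.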
Contribution
It proposes a novel multi-objective training method for PLDA that balances likelihood maximization with speaker distinction, leading to better verification accuracy.
Findings
Over 10% relative improvement in both EER and MinDCF on the NIST SRE14 i-vector challenge dataset
Approximately 20% relative improvement in EER on the MCE18 dataset
Enhanced speaker verification performance through multi-objective optimization
Abstract
Most current state-of-the-art text-independent speaker verification systems use probabilistic linear discriminant analysis (PLDA) as their backend classifier. The parameters of PLDA are usually estimated by maximizing an objective function that focuses on increasing the log-likelihood but ignores the distinction between speakers. To better distinguish speakers, we propose multi-objective optimization training for PLDA. Experimental results show that the proposed method achieves more than 10% relative improvement in both EER and MinDCF on the NIST SRE14 i-vector challenge dataset, and about 20% relative improvement in EER on the MCE18 dataset.
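For readers unfamiliar with the reported metrics, the following is a minimal sketch of how an equal error rate (EER) and a relative improvement are commonly computed from verification scores. It is not the authors' evaluation code: the function names `equal_error_rate` and `relative_improvement` and all scores and percentages below are hypothetical illustrations, and a simple threshold sweep stands in for whatever scoring tool the paper used.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold. The EER is taken
    at the threshold where the miss rate and false-alarm rate are closest.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # Miss: a same-speaker trial that is rejected.
        miss = sum(s < t for s in target_scores) / len(target_scores)
        # False alarm: a different-speaker trial that is accepted.
        false_alarm = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - false_alarm)
        if gap < best_gap:
            best_gap, eer = gap, (miss + false_alarm) / 2
    return eer


def relative_improvement(baseline, proposed):
    """Fractional reduction of an error metric relative to the baseline."""
    return (baseline - proposed) / baseline


# Hypothetical scores, not taken from the paper.
targets = [2.0, 1.5, 0.2, 1.0]
nontargets = [0.1, -0.5, 0.5, -1.0]
print(equal_error_rate(targets, nontargets))

# Hypothetical EERs: a drop from 5.0% to 4.0% is a 20% relative improvement.
print(relative_improvement(5.0, 4.0))
```

Under this definition, a "20% relative improvement in EER" means the error rate falls to 80% of its baseline value, not that it drops by 20 percentage points.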
Taxonomy
Topics: Speech Recognition and Synthesis · Music and Audio Processing · Speech and Audio Processing
