Weakly Supervised PLDA Training
Lantian Li, Yixiang Chen, Dong Wang, Chenghui Zhao

TL;DR
This paper proposes a cost-effective weakly supervised training method for PLDA in speaker verification, leveraging session-based assumptions to reduce labeling costs while maintaining performance.
Contribution
It introduces a novel weak supervision approach for PLDA training that requires less labeled data and can be used for discriminative adaptation, improving efficiency.
Findings
Weak training achieves good performance with limited labeled data.
Weak training can be used as an effective discriminative adaptation method.
The method is validated on large-scale telephony customer service data.
Abstract
PLDA is a popular normalization approach for the i-vector model, and it has delivered state-of-the-art performance in speaker verification. However, PLDA training requires a large amount of labelled development data, which is highly expensive in most cases. We present a cheap PLDA training approach, which assumes that speakers in the same session can be easily separated, and speakers in different sessions are simply different. This results in 'weak labels' which are not fully accurate but cheap, leading to a weak PLDA training. Our experimental results on real-life large-scale telephony customer service archives demonstrated that the weak training can offer good performance when human-labelled data are limited. More interestingly, the weak training can be employed as a discriminative adaptation approach, which is more efficient than the prevailing unsupervised method when…
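The weak-labelling assumption in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `assign_weak_labels`, the session identifiers, and the `agent`/`customer` tags are all hypothetical, and the sketch assumes the speakers within each session have already been separated. Every (session, speaker) pair receives its own label, so a speaker who appears in two sessions is wrongly treated as two people, which is why the labels are cheap but not fully accurate.

```python
from typing import Dict, List, Tuple


def assign_weak_labels(sessions: Dict[str, List[str]]) -> Dict[Tuple[str, str], int]:
    """Give every (session, within-session speaker) pair its own weak label.

    Assumption (hypothetical input format): `sessions` maps a session id to
    the tags of the speakers already separated inside that session.
    """
    labels: Dict[Tuple[str, str], int] = {}
    next_label = 0
    for session_id in sorted(sessions):
        for local_speaker in sessions[session_id]:
            key = (session_id, local_speaker)
            if key not in labels:
                # Speakers in different sessions are assumed to be different,
                # so a fresh label is issued for each new pair.
                labels[key] = next_label
                next_label += 1
    return labels


# Two hypothetical customer-service calls, each with two separated speakers.
sessions = {
    "call_001": ["agent", "customer"],
    "call_002": ["agent", "customer"],
}
weak_labels = assign_weak_labels(sessions)
```

Labels produced this way could then stand in for human speaker labels when training PLDA on the corresponding i-vectors. Even if the same agent handled both calls, the sketch assigns that agent two distinct labels.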
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
