Weakly Supervised PLDA Training

Lantian Li; Yixiang Chen; Dong Wang; Chenghui Zhao

arXiv:1609.08441 · cs.LG · May 24, 2017 · 1 cites

Open Access

TL;DR

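This paper proposes a low-cost, weakly supervised training method for PLDA in speaker verification. It assumes that speakers within a session are easily separated and that speakers in different sessions are distinct, which cuts labelling cost while still giving good performance when human-labelled data are limited.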

Contribution

It introduces a weakly supervised PLDA training approach that needs less human-labelled data and can also serve as a discriminative adaptation method, which the abstract reports to be more efficient than the prevailing unsupervised method.

Findings

01

Weak training achieves good performance with limited labeled data.

02

Weak training can be used as an effective discriminative adaptation method.

03

Method is validated on large-scale telephony data.

Abstract

PLDA is a popular normalization approach for the i-vector model, and it has delivered state-of-the-art performance in speaker verification. However, PLDA training requires a large amount of labelled development data, which is highly expensive in most cases. We present a cheap PLDA training approach, which assumes that speakers in the same session can be easily separated, and speakers in different sessions are simply different. This results in 'weak labels', which are not fully accurate but cheap, leading to a weak PLDA training. Our experimental results on real-life large-scale telephony customer service archives demonstrated that the weak training can offer good performance when human-labelled data are limited. More interestingly, the weak training can be employed as a discriminative adaptation approach, which is more efficient than the prevailing unsupervised method when…
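
The session-based labelling assumption in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `assign_weak_labels`, the `(session_id, local_speaker)` input format, and the example data are all hypothetical, and the within-session speaker split is assumed to have been produced beforehand by some separate step. The sketch only shows the bookkeeping: every (session, local speaker) pair becomes its own weak speaker class, so the same person appearing in two sessions receives two different labels, which is why the labels are cheap but not fully accurate.

```python
from typing import Dict, List, Tuple

# Hypothetical input format: one (session_id, local_speaker) pair per utterance,
# where local_speaker is the speaker index found inside that session.
Segment = Tuple[str, int]


def assign_weak_labels(segments: List[Segment]) -> List[int]:
    """Give each (session, local speaker) pair its own global weak label.

    Assumption taken from the abstract: speakers within a session are already
    separated, and speakers in different sessions are treated as different.
    """
    label_of: Dict[Segment, int] = {}
    labels: List[int] = []
    for key in segments:
        if key not in label_of:
            # First time this (session, speaker) pair is seen: new weak class.
            label_of[key] = len(label_of)
        labels.append(label_of[key])
    return labels


# Illustrative data only: two sessions, two local speakers in each.
segments = [("s1", 0), ("s1", 1), ("s1", 0), ("s2", 0), ("s2", 1)]
weak_labels = assign_weak_labels(segments)
print(weak_labels)
```

Weak labels of this kind could then be passed, in place of human speaker labels, to whatever PLDA trainer is in use.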

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing