Prototype and Instance Contrastive Learning for Unsupervised Domain Adaptation in Speaker Verification

Wen Huang; Bing Han; Zhengyang Chen; Shuai Wang; Yanmin Qian

arXiv:2410.17033·eess.AS·October 23, 2024

Open Access

TL;DR

This paper introduces PICL, a dual-level contrastive learning approach for unsupervised domain adaptation in speaker verification, improving robustness and generalization across diverse mismatch scenarios.

Contribution

The paper proposes a novel dual-level contrastive learning method, combining prototype and instance contrastive learning, for better unsupervised domain adaptation in speaker verification.

Findings

01. Achieved state-of-the-art performance on multiple datasets.

02. Demonstrated improved robustness across various mismatch scenarios.

03. Validated the generalization capability of PICL.

Abstract

Speaker verification systems trained on one domain usually suffer performance degradation when applied to another domain. To address this challenge, researchers commonly use methods based on feature distribution matching in unsupervised domain adaptation scenarios, where some unlabeled target-domain data is available. However, these methods often yield limited performance improvement and generalize poorly across different mismatch situations. In this paper, we propose Prototype and Instance Contrastive Learning (PICL), a novel method for unsupervised domain adaptation in speaker verification through dual-level contrastive learning. For prototype contrastive learning, we generate pseudo labels via clustering to create dynamically updated prototype representations, aligning instances with their corresponding class or cluster prototypes. For instance contrastive learning, we minimize the distance…
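The prototype-level step described in the abstract (pseudo labels from clustering, prototypes per cluster, instances pulled toward their own prototype) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the loss, so the code below assumes a generic InfoNCE-style objective over cosine similarities, a momentum rule for the "dynamically updated" prototypes, and hypothetical names (`compute_prototypes`, `update_prototypes`, `prototype_contrastive_loss`, `temperature`, `momentum`). The pseudo labels are assumed to come from any clustering step run beforehand.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def compute_prototypes(embeddings, pseudo_labels, num_clusters):
    """Prototype = re-normalized mean of the normalized embeddings in a cluster.

    Assumption: pseudo_labels are integer cluster ids in [0, num_clusters).
    Empty clusters are left as zero vectors.
    """
    emb = l2_normalize(embeddings)
    protos = np.zeros((num_clusters, emb.shape[1]))
    for k in range(num_clusters):
        members = emb[pseudo_labels == k]
        if len(members) > 0:
            protos[k] = members.mean(axis=0)
    return l2_normalize(protos)


def update_prototypes(old, new, momentum=0.9):
    """Hypothetical momentum rule standing in for 'dynamically updated' prototypes."""
    return l2_normalize(momentum * old + (1.0 - momentum) * new)


def prototype_contrastive_loss(embeddings, pseudo_labels, prototypes, temperature=0.1):
    """InfoNCE-style loss: each instance should be closest to its own prototype.

    Equivalent to softmax cross-entropy over instance-to-prototype
    cosine similarities scaled by 1 / temperature.
    """
    emb = l2_normalize(embeddings)
    logits = emb @ prototypes.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(emb)), pseudo_labels].mean()
```

Under these assumptions, the loss is small when each embedding lies near the prototype of its assigned cluster and grows when the pseudo labels disagree with the geometry of the embedding space, which is the alignment behaviour the abstract describes.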

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing

Methods: Contrastive Learning