Multi-Domain Adaptation by Self-Supervised Learning for Speaker Verification

Wan Lin; Lantian Li; Dong Wang

arXiv:2309.14149 · cs.SD · September 26, 2023 · 1 citation

Open Access

TL;DR

This paper introduces a self-supervised learning approach to multi-domain adaptation for speaker verification. It targets complex real-world environments that contain multiple domains, where methods designed for one-to-one adaptation are suboptimal.

Contribution

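The paper proposes three strategies that extend the basic self-supervised adaptation algorithm to multi-domain speaker verification: an in-domain negative sampling strategy, a MoCo-like memory bank scheme, and a CORAL-like distribution alignment. Together they address a gap left by existing one-to-one domain adaptation techniques.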

Findings

01 Outperforms basic self-supervised adaptation in multi-domain settings

02 Consistent improvements across in-domain and cross-domain tests

03 Effective handling of complex multi-domain environments

Abstract

In real-world applications, speaker recognition models often face various domain-mismatch challenges, leading to a significant drop in performance. Although numerous domain adaptation techniques have been developed to address this issue, almost all present methods focus on a simple configuration where the model is trained in one domain and deployed in another. However, real-world environments are often complex and may contain multiple domains, making the methods designed for one-to-one adaptation suboptimal. In our paper, we propose a self-supervised learning method to tackle this multi-domain adaptation problem. Building upon the basic self-supervised adaptation algorithm, we designed three strategies to make it suitable for multi-domain adaptation: an in-domain negative sampling strategy, a MoCo-like memory bank scheme, and a CORAL-like distribution alignment. We conducted experiments…
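
The three strategies named in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not give the exact losses, so the sketch assumes a standard InfoNCE contrastive loss, the standard CORAL covariance-matching loss, and a simple first-in-first-out queue standing in for the MoCo-like memory bank (the momentum encoder is omitted). All function and parameter names (`coral_loss`, `in_domain_info_nce`, `enqueue`, `temperature`, `max_size`) are hypothetical.

```python
import numpy as np


def coral_loss(source, target):
    """Standard CORAL loss (assumed form): squared Frobenius distance
    between the two feature covariance matrices, scaled by 1 / (4 d^2)."""
    d = source.shape[1]
    cov_s = np.cov(source, rowvar=False)
    cov_t = np.cov(target, rowvar=False)
    return float(np.sum((cov_s - cov_t) ** 2) / (4.0 * d * d))


def enqueue(bank, bank_domains, new_emb, new_domains, max_size):
    """FIFO memory bank (assumed design): append new embeddings with
    their domain labels and keep only the most recent max_size entries."""
    bank = np.concatenate([bank, new_emb])[-max_size:]
    bank_domains = np.concatenate([bank_domains, new_domains])[-max_size:]
    return bank, bank_domains


def in_domain_info_nce(anchors, positives, anchor_domains,
                       bank, bank_domains, temperature=0.1):
    """InfoNCE loss in which each anchor only sees negatives from the
    memory bank that carry the same domain label as the anchor."""
    def l2norm(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    a, p, b = l2norm(anchors), l2norm(positives), l2norm(bank)
    pos = np.sum(a * p, axis=1) / temperature      # shape (N,)
    neg = a @ b.T / temperature                    # shape (N, K)
    # Mask out bank entries from other domains so they contribute nothing.
    same_domain = anchor_domains[:, None] == bank_domains[None, :]
    neg = np.where(same_domain, neg, -np.inf)
    logits = np.concatenate([pos[:, None], neg], axis=1)
    # Numerically stable log-sum-exp over the positive and kept negatives.
    m = logits.max(axis=1, keepdims=True)
    lse = m[:, 0] + np.log(np.exp(logits - m).sum(axis=1))
    return float(np.mean(lse - pos))
```

In this sketch, restricting negatives to the anchor's own domain means the contrastive loss cannot be reduced simply by telling domains apart, while the CORAL term pulls the second-order statistics of two domains toward each other. How the paper combines or weights these terms is not stated in the abstract and is left open here.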

Taxonomy

Topics: Speech Recognition and Synthesis · Domain Adaptation and Few-Shot Learning · Natural Language Processing Techniques