Efficient Black-Box Speaker Verification Model Adaptation with Reprogramming and Backend Learning
Jingyu Li, Tan Lee

TL;DR
This paper presents a black-box domain adaptation method for speaker verification that uses input reprogramming and backend learning, achieving efficient adaptation with minimal additional parameters and computational cost.
Contribution
It introduces a novel black-box adaptation approach for DNN-based speaker verification using input reprogramming and backend learning, avoiding full model fine-tuning.
Findings
The method achieves performance comparable to or better than full fine-tuning in language-mismatch scenarios.
It does so with fewer parameters and less computation.
It adapts to the target domain without accessing or modifying the original model's internal parameters.
Abstract
The development of deep neural networks (DNN) has significantly enhanced the performance of speaker verification (SV) systems in recent years. However, a critical issue that persists when applying DNN-based SV systems in practical applications is domain mismatch. To mitigate the performance degradation caused by the mismatch, domain adaptation becomes necessary. This paper introduces an approach to adapt DNN-based SV models by manipulating the learnable model inputs, inspired by the concept of adversarial reprogramming. The pre-trained SV model remains fixed and functions solely in the forward process, resembling a black-box model. A lightweight network is utilized to estimate the gradients for the learnable parameters at the input, which bypasses the gradient backpropagation through the black-box model. The reprogrammed output is processed by a two-layer backend learning module as the…
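
The core idea in the abstract, learning parameters at the input of a frozen model that is only ever run forward, can be illustrated with a small sketch. This is not the authors' implementation: the paper uses a lightweight network to estimate the input gradients, whereas the sketch below substitutes a simple random-direction finite-difference estimator. The frozen model `black_box_embed`, the additive perturbation `delta`, the squared-error objective, and all sizes and step sizes are hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the frozen pre-trained SV model.
# Only its forward pass is used; its weights are never updated.
W_FROZEN = rng.standard_normal((8, 16)) / 4.0


def black_box_embed(x):
    """Forward pass of the frozen model (treated as a black box)."""
    return np.tanh(W_FROZEN @ x)


def loss(delta, x, target):
    """Squared distance between the reprogrammed embedding and a target."""
    emb = black_box_embed(x + delta)
    return float(np.sum((emb - target) ** 2))


def estimate_grad(delta, x, target, n_dirs=32, mu=1e-3):
    """Estimate d(loss)/d(delta) from forward queries only.

    Assumption: random-direction finite differences replace the paper's
    lightweight gradient-estimation network.
    """
    base = loss(delta, x, target)
    grad = np.zeros_like(delta)
    for _ in range(n_dirs):
        u = rng.standard_normal(delta.shape)
        grad += (loss(delta + mu * u, x, target) - base) / mu * u
    return grad / n_dirs


def reprogram(x, target, steps=200, lr=0.05):
    """Learn an additive input perturbation without backpropagating
    through the black-box model."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        delta -= lr * estimate_grad(delta, x, target)
    return delta
```

A possible usage: draw an input `x` of length 16, build a reachable target with `black_box_embed` applied to another random input, call `reprogram(x, target)`, and compare `loss` before and after. The backend learning module mentioned in the abstract is not modelled here.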
Taxonomy
Topics: Speech Recognition and Synthesis · Natural Language Processing Techniques · Topic Modeling
