# Distance-based Weight Transfer from Near-field to Far-field Speaker Verification

**Authors:** Li Zhang, Qing Wang, Hongji Wang, Yue Li, Wei Rao, Yannan Wang, Lei Xie

arXiv: 2303.00264 · 2023-03-16

## TL;DR

This paper introduces a weight transfer regularization loss to improve far-field speaker verification by leveraging near-field pre-trained models, addressing catastrophic forgetting and overfitting, and providing theoretical analysis of generalization bounds.

## Contribution

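The paper proposes a weight transfer regularization (WTR) loss that keeps the fine-tuned speaker verification model close to its near-field pre-trained weights, supports it with a PAC-Bayes generalization analysis, and evaluates it using VoxCeleb for pre-training and FFSVC for fine-tuning.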

## Key findings

- WTR loss improves fine-tuning performance on far-field speech
- Theoretical analysis shows WTR tightens generalization bounds
- Experiments with VoxCeleb as the pre-training dataset and FFSVC as the fine-tuning dataset are used to evaluate the effectiveness of the WTR loss
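
As a rough illustration of why a weight-distance penalty can tighten a PAC-Bayes bound, consider one common form of the bound. This is a sketch under stated assumptions, not the derivation from the paper: the prior $P$ is assumed to be a Gaussian centered at the pre-trained weights $w_0$, the posterior $Q$ a Gaussian centered at the fine-tuned weights $w$, both with the same covariance $\sigma^2 I$, and $n$ is the number of fine-tuning samples. The symbols $\sigma$, $\delta$, and $n$ are introduced here for illustration only.

$$
\mathbb{E}_{h \sim Q}\big[L(h)\big] \;\le\; \mathbb{E}_{h \sim Q}\big[\hat{L}(h)\big] + \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta}}{2n}},
\qquad
\mathrm{KL}(Q \,\|\, P) = \frac{\lVert w - w_0 \rVert_2^2}{2\sigma^2}
$$

Under these assumptions the bound holds with probability at least $1 - \delta$, and the complexity term grows with the distance $\lVert w - w_0 \rVert_2$, so penalizing that distance during fine-tuning shrinks the upper bound.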

## Abstract

The scarcity of labeled far-field speech is a constraint for training superior far-field speaker verification systems. Fine-tuning the model pre-trained on large-scale near-field speech substantially outperforms training from scratch. However, the fine-tuning method suffers from two limitations: catastrophic forgetting and overfitting. In this paper, we propose a weight transfer regularization (WTR) loss to constrain the distance between the weights of the model pre-trained on large-scale near-field speech and those of the model fine-tuned on a small amount of far-field speech. With the WTR loss, the fine-tuning process takes advantage of the previously acquired discriminative ability from the large-scale near-field speech without catastrophic forgetting. Meanwhile, we use the PAC-Bayes generalization theory to analyze the generalization bound of the fine-tuned model with the WTR loss. The analysis indicates that the WTR term gives the fine-tuned model a tighter generalization upper bound. Moreover, we explore three kinds of norm distance for weight transfer: L1-norm distance, L2-norm distance, and Max-norm distance. Finally, we evaluate the effectiveness of the WTR loss on the VoxCeleb (pre-training) and FFSVC (fine-tuning) datasets.
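
The idea of constraining the weight distance can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation: the function names `weight_distance` and `wtr_total_loss`, the trade-off coefficient `alpha`, and the choice to compute one norm over all parameters flattened together are assumptions, and the Max-norm is assumed here to mean the largest absolute element-wise difference.

```python
import math
from typing import Dict, List

# Hypothetical representation: parameter name -> flattened list of weights.
Weights = Dict[str, List[float]]


def weight_distance(finetuned: Weights, pretrained: Weights, norm: str = "l2") -> float:
    """Distance between fine-tuned and pre-trained weights under a chosen norm."""
    # Absolute element-wise differences over every shared parameter.
    diffs = [
        abs(w - w0)
        for name in pretrained
        for w, w0 in zip(finetuned[name], pretrained[name])
    ]
    if norm == "l1":
        return sum(diffs)
    if norm == "l2":
        return math.sqrt(sum(d * d for d in diffs))
    if norm == "max":
        # Assumption: Max-norm taken as the largest absolute difference.
        return max(diffs)
    raise ValueError(f"unknown norm: {norm}")


def wtr_total_loss(task_loss: float, finetuned: Weights, pretrained: Weights,
                   alpha: float = 0.01, norm: str = "l2") -> float:
    """Task loss plus a weighted distance penalty (alpha is an assumed hyperparameter)."""
    return task_loss + alpha * weight_distance(finetuned, pretrained, norm)
```

In a real training loop the same penalty would be computed on framework tensors so that gradients flow through the fine-tuned weights while the pre-trained weights stay frozen.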

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/2303.00264/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/2303.00264/full.md

## References

27 references — full list in the complete paper: https://tomesphere.com/paper/2303.00264/full.md

---
Source: https://tomesphere.com/paper/2303.00264